Methods and compositions related to tagging of membrane surface proteins

ABSTRACT

This invention relates to methods and reagents for selectively labeling membrane surface proteins using a labeling agent. The label may be used to isolate preparations of membrane surface proteins. Preparations of membrane surface proteins may be analysed by a variety of high-throughput techniques to allow rapid profiling of membrane surface protein composition.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 60/296,334, filed Jun. 6, 2001 and incorporated by reference herein in its entirety.

BACKGROUND

Proteins associated with the plasma membrane constitute a significant and functionally important fraction of the proteins in a cell. Key functions, such as the communication of a cell with its environment, are largely dependent on membrane proteins. Membrane proteins are targets of choice for pharmaceuticals in part because of their exposure to the extracellular environment. Furthermore, cell surface proteins are excellent markers for use in cell sorting and identification because cells need not be damaged in order to detect these proteins.

The clinical importance of membrane proteins may be illustrated through an examination of the diagnosis and treatment of various cancers. For example, cancer therapeutics are notorious for their severe side effects, which result largely from a lack of specificity. Most cancer therapeutics target processes that are common to all growing cells and therefore cause serious damage to healthy cells in addition to cancerous cells. Substantial research has been devoted to identifying distinguishing features of cancer cells that may be used to selectively target therapeutic substances. Cancer research has also focused on the precise tailoring of therapeutic regimens to specific tumor types, with the goal of maximizing efficacy and minimizing toxicity. Improvements in cancer classification and the identification of distinctive markers for cancer types are therefore critical to advances in cancer treatment.

Cancers have traditionally been classified primarily on morphological appearance. However, tumors with similar morphology can follow significantly different clinical courses and show different responses to therapy. In a few cases, such clinical heterogeneity has been explained by dividing morphologically similar tumors into subtypes with distinct pathogeneses. Acute leukemias and non-Hodgkin's lymphomas have been molecularly subclassified with substantial improvement in treatment efficacy. Important subclasses are likely to exist for many more tumors but have not yet been defined by molecular markers. For example, prostate cancers of identical grade can have widely variable clinical courses. Large scale profiling of membrane proteins would provide useful “fingerprints” for the classification of cancers, and, in addition, membrane proteins unique to certain cancers could be used as targets for therapeutics or as homing signals to specifically deliver therapeutics to the appropriate cell types.

In addition to the plasma membrane, cells contain an extensive network of intracellular membranes, including the membranes surrounding the various organelles. Membrane proteins located on these intracellular membranes are often involved in mediating interactions between the cell and the organelles, and as such represent attractive targets for research.

Membrane-embedded proteins are difficult to characterize with current methodologies. Membrane proteins are more difficult to extract due to their highly hydrophobic nature and lower solubility. The low solubility of these hydrophobic proteins, especially those of high molecular weight, gives rise to protein aggregation. Furthermore, membrane proteins are often present at relatively low abundance, making the identification of membrane proteins by, for example, microsequencing techniques, a challenging task.

It would be advantageous to have improved methods and reagents for the preparation and/or detection of cell surface proteins, for example by improving the representation of cell surface proteins in protein extracts to facilitate further identification and analysis.

SUMMARY OF THE INVENTION

In general, the invention provides methods for selectively preparing a wide range of membrane proteins, e.g. by labeling, enriching, analyzing and/or identifying membrane surface proteins, in the field of proteomics research. Preparations of membrane surface proteins generated by methods of the invention may be subjected to a variety of analytic techniques to generate profiles of these membrane surface proteins.

In one aspect, the invention provides methods for selectively labeling membrane surface proteins, and preferably cell surface proteins. In certain embodiments, methods of the invention comprise contacting a cell with a labeling agent to generate a plurality of labeled cell surface proteins. Labeling agents of the invention generally comprise a protein binding moiety and a marking moiety, wherein the protein binding moiety is capable of interacting covalently or non-covalently with a broad range of cell surface proteins, and wherein the marking moiety is useful in detecting proteins associated with the labeling agent. The protein binding moiety and marking moiety may, in certain instances, be present in a single, multifunctional moiety. Optionally, a protein binding moiety covalently binds to cysteines, glycans and/or amino groups, such as the ε-amino groups of lysine.

In certain embodiments, the properties of the labeling agent may be used to separate labeled proteins from unlabeled proteins. Labeled proteins may be processed by a variety of methods including gel electrophoresis and chromatography. Labeled proteins may also be analyzed and/or identified by techniques including, but not limited to, two-dimensional gel electrophoresis, antibody-based techniques, protein identification arrays, mass spectrometry, protein sequencing, etc. In certain embodiments, the data obtained from the identification and/or analysis of cell or membrane surface proteins forms a cell or membrane surface protein profile. Such profiles may be generated for a plurality of sample types. For example, in certain embodiments, cell and membrane surface protein profiles may be generated and compared across a variety of healthy and disordered cells, including cell lines and cultured cells. In other embodiments, profiles may also be compared for stem cells and more differentiated cells. The comparison of cell or membrane surface protein profiles will be useful for a variety of purposes including, but not limited to, diagnostics, cell identification and sorting, screening for therapeutics, identifying cell surface proteins that are indicative of certain biological conditions, etc.

In a further aspect, the invention provides methods for differential display of membrane surface proteins. Such methods generally involve selecting two or more samples to be analyzed. Each sample is treated with a labeling agent. Preferably the labeling agents are identical except that the marking moieties will be selected so as to be distinguishable. For example, a first labeling agent may comprise a first fluorescent agent modified according to the methods of the invention to become substantially membrane impermeable, and a second labeling agent may comprise a second fluorescent agent which was also made to be substantially membrane impermeable according to the method of the invention, the second fluorescent agent having fluorescent properties (e.g. excitation spectrum, emission spectrum, fluorescence efficiency, etc.) that are distinguishable from those of the first fluorescent agent. After labeling, proteins from each sample may be mixed and subjected to all further analysis together. For example, the proteins may be mixed and subjected to two-dimensional electrophoresis. In this example, the protein spots on the gel are analyzed for abundance of each fluorescent moiety to provide a direct comparison of protein abundance in the different samples. In certain embodiments differential display methods described herein may be used with more than two samples, so long as each sample is labeled with a distinguishable marker. For example, three samples may be differentially labeled with red, green and blue fluorescing moieties, mixed and analyzed to provide a differential display of the relative membrane surface protein abundance in each sample.
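By way of illustration only, the per-spot comparison described above might be computed as in the following minimal sketch. The function name `differential_display`, the spot identifiers, the intensity values, the choice of a log2 ratio and the pseudocount are all assumptions made for this example and are not part of the specification.

```python
from math import log2

def differential_display(spots, reference="red", pseudocount=1.0):
    """Return, for every gel spot, the log2 ratio of each channel to the reference channel.

    `spots` maps a hypothetical spot identifier to a dict of fluorescence
    intensities keyed by channel name. The pseudocount avoids division by zero
    for spots that are absent from one sample.
    """
    ratios = {}
    for spot_id, intensities in spots.items():
        ref_value = intensities[reference] + pseudocount
        ratios[spot_id] = {
            channel: log2((value + pseudocount) / ref_value)
            for channel, value in intensities.items()
            if channel != reference
        }
    return ratios

# Hypothetical intensities for three differentially labeled samples.
example_spots = {
    "spot_1": {"red": 999.0, "green": 1999.0, "blue": 999.0},
    "spot_2": {"red": 399.0, "green": 99.0, "blue": 399.0},
}
example_ratios = differential_display(example_spots)
```

A positive value indicates that the protein in that spot is more abundant in the sample carrying that label than in the reference sample, and a negative value indicates the reverse.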

In a further embodiment, the invention provides reagents to be used in methods of the invention. Exemplary specific labeling agents are substantially membrane impermeable, and therefore enable selective modification of cell surface proteins. Certain labeling agents of the invention comprise a reversible bond that facilitates removal of a substantial portion of the labeling agent from the labeled protein, which may, in certain embodiments, facilitate separation and/or identification of labeled proteins. In some embodiments of the invention the labeling agent is not a biomolecule and may therefore have a reduced tendency to form non-specific interactions with other proteins.

In certain embodiments, labeling agents of the present invention are represented by structure 1:

wherein:

-   R is present 1 to 4 times;
-   R is selected from the group consisting of —B(OH)₂,
-   W is a linker selected from the group consisting of N(R₂)CO, CON(R₂), N(R₂)COC(R₂)₂, CON(R₂)C(R₂)₂, O, OC(R₂)₂, S, and S(R₂)₂;
-   Z is a spacer selected from the group consisting of a saturated or unsaturated chain up to about 6 carbon equivalents in length, unbranched saturated or unsaturated chain of from about 6 to 18 carbon equivalents in length with at least one intermediate amide or disulfide moiety, and a polyethylene glycol chain of from about 3 to 12 carbon equivalents in length;
-   R₁ is a reactive electrophilic or nucleophilic moiety suitable for reaction of the PDAB (phenyldiboronic acid) with a protein; and
-   R₂ is H, alkyl, or aryl.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein Z contains a disulfide moiety.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is —B(OH)₂, W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A:

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is —B(OH)₂, W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is —B(OH)₂, W is CH₂NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CH₂NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CH₂NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is —B(OH)₂, W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is —B(OH)₂, W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is —B(OH)₂, W is CH₂NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CH₂NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CH₂NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is —B(OH)₂, W is CH₂NHCO, Z is (CH₂)_(n)C(O)NH(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydroxysulfo-succinimidyl ester of structure B:

B.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is —B(OH)₂, W is CH₂NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CH₂NHCO, Z is (CH₂)_(n)C(O)NH(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CH₂NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is CONH, Z is (CH₂)₅, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is —B(OH)₂, W is CONH, Z is (CH₂)₅, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is NHCO, Z is (CH₂)₂C(O)NH(CH₂)₅, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 1 and the accompanying definitions, wherein R is

W is NHCO, Z is (CH₂)₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, labeling agents of the present invention are represented by structure 2:

wherein:

-   R₃ is present 1 or 2 times and is OH;
-   D is selected from the group consisting of O, S, and NH;
-   Q is selected from the group consisting of OR₂, NHR₂, NHOR₂, and CH₂-EWG, wherein EWG is an electron withdrawing group, such as CN, COOH, etc.;
-   W is a linker selected from the group consisting of N(R₂)CO, CON(R₂), N(R₂)COC(R₂)₂, CON(R₂)C(R₂)₂, O, OC(R₂)₂, S, and S(R₂)₂;
-   Z is a spacer selected from the group consisting of a saturated or unsaturated chain up to about 6 carbon equivalents in length, unbranched saturated or unsaturated chain of from about 6 to 18 carbon equivalents in length with at least one intermediate amide or disulfide moiety, and a polyethylene glycol chain of from about 3 to 12 carbon equivalents in length;
-   R₁ is a reactive electrophilic or nucleophilic moiety suitable for reaction of the PDAB (phenyldiboronic acid) with a protein; and
-   R₂ is H, alkyl, or aryl.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein Z contains a disulfide moiety.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A:

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B:

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is NHCO, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is CONH, Z is (CH₂)_(n)—S—S—(CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is CONH, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present one time, W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.

In certain embodiments, the labeling agents of the present invention are of structure 2 and the accompanying definitions, wherein R is present two times, W is NHCO, Z is (CH₂)_(n) wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
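The structure 2 embodiments recited above appear to follow a combinatorial pattern: one or two occurrences of the ring substituent, two linkers W, two spacers Z, two groups Q and two reactive moieties R₁. The following sketch enumerates that cross product as a bookkeeping aid. It rests on the assumption that the recited embodiments are exactly this cross product, and the plain-text labels and the function name are hypothetical shorthand introduced for the example, not chemical notation from the specification.

```python
from itertools import product

# Hypothetical plain-text labels for the options recited in the embodiments above.
SUBSTITUENT_COUNTS = ("one time", "two times")
LINKERS_W = ("NHCO", "CONH")
SPACERS_Z = ("(CH2)n-S-S-(CH2)n", "(CH2)n")  # n is an integer from 1 to 6
GROUPS_Q = ("OR2", "NHOR2")
REACTIVE_R1 = (
    "hydrazide of structure A",
    "hydroxysulfo-succinimidyl ester of structure B",
)

def enumerate_structure2_embodiments():
    """Return one dict per combination of the options listed above."""
    keys = ("count", "W", "Z", "Q", "R1")
    option_sets = (SUBSTITUENT_COUNTS, LINKERS_W, SPACERS_Z, GROUPS_Q, REACTIVE_R1)
    return [dict(zip(keys, combo)) for combo in product(*option_sets)]

embodiments = enumerate_structure2_embodiments()
```

Under this assumption the list contains 32 entries, 16 with the hydrazide and 16 with the hydroxysulfo-succinimidyl ester.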

Various embodiments are described in the claims, and all such embodiments are hereby incorporated into the specification.

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of chemistry, cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. See, for example, “Bioconjugate Techniques”, G T Hermanson, Academic Press (1996); Molecular Cloning A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al. U.S. Pat. No. 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.), Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).

Other features and advantages of the invention will be apparent from the following detailed description, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1: A flowchart illustrating exemplary methodologies for the profiling of cell surface proteins.

FIG. 2: A flowchart illustrating exemplary methodologies for the multiple labeling and profiling of membrane surface proteins.

FIG. 3: Exemplary labeling agents comprising a phenylboronic acid (“PBA”) type marking moiety.

FIG. 4: Exemplary labeling agents comprising a PBA type marking moiety.

FIG. 5: Exemplary labeling agents comprising a PBA type marking moiety.

FIG. 6: Exemplary labeling agents comprising a salicylhydroxamic acid (“SHA”) marking moiety.

FIG. 7: Exemplary labeling agents comprising an SHA marking moiety.

FIG. 8: Exemplary labeling agents comprising an SHA marking moiety.

DETAILED DESCRIPTION OF THE INVENTION

1. Definitions

For convenience, certain terms employed in the specification, examples, and appended claims are collected here. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element.

The term “biological state” is used herein to refer to essentially any biologically relevant characteristic of a cell or tissue sample. “Biological state” may refer to the presence or absence of a disease condition, a tissue type, a developmental stage, an effect on a tissue or cell caused by a therapeutic or other biologically active compound, etc.

A “cell sample” is any sample obtained from a biological source and containing cells. Cell samples are intended to encompass, without limitation, solid or semi-solid tissue samples (e.g. tumor biopsy, skin scraping, stool sample, etc.) as well as fluid samples (e.g. blood, urine, cerebro-spinal fluid, saliva, etc.). Cell samples also include cultured cells and cell lines. A “test cell sample” is a cell sample for which it is desirable to characterize a biological state. A “reference cell sample” is a cell sample which has been characterized with respect to a biological state. A “diseased cell sample” is a cell sample affected by a disorder, disease or abnormal state, including genetically or otherwise altered cell lines or cultured cells.

A “cell surface protein” is used herein to mean any protein that is exposed to the extracellular environment and associated with the membrane. Cell surface proteins include, but are not limited to, integral membrane proteins (i.e. proteins with one or more transmembrane domains), membrane-anchored proteins (i.e. proteins attached to the membrane through a lipophilic anchor), and membrane-associated proteins (i.e. proteins that have some affinity for the membrane but are not covalently attached to a moiety that is inserted in the membrane).

A “cell surface protein profile” or “membrane surface protein profile” is used herein to indicate an aggregate of information regarding a preparation of cell or membrane surface proteins. A profile will comprise, at minimum, information regarding the presence or absence of such proteins. More typically, a profile will comprise information regarding the presence or absence of a plurality of such proteins. In addition, a profile may contain other information about each identified protein, such as relative or absolute amount of protein present, the degree of post-translational modification, membrane topology, three-dimensional structure, isoelectric point, molecular weight, etc. A “test cell surface protein profile” is a cell surface protein profile obtained from a test cell sample. A “reference cell surface protein profile” is a cell surface protein profile obtained from a reference cell sample.
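
As a purely illustrative aid, the following minimal Python sketch shows one way a cell surface protein profile of the kind defined above might be represented and compared between a test cell sample and a reference cell sample. The class names, field names and protein identifiers are hypothetical and form no part of the specification; only presence or absence is treated as mandatory, and the remaining attributes are optional.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class ProteinEntry:
    """One protein in a profile; only presence/absence is mandatory."""
    present: bool
    amount: Optional[float] = None               # relative or absolute amount, if measured
    molecular_weight_da: Optional[float] = None  # optional descriptive attribute
    isoelectric_point: Optional[float] = None    # optional descriptive attribute


@dataclass
class SurfaceProteinProfile:
    """Aggregate of information about one preparation of surface proteins."""
    sample_name: str
    proteins: Dict[str, ProteinEntry] = field(default_factory=dict)

    def present_ids(self) -> Set[str]:
        """Identifiers of all proteins recorded as present."""
        return {pid for pid, entry in self.proteins.items() if entry.present}


def presence_differences(test: SurfaceProteinProfile,
                         reference: SurfaceProteinProfile) -> Dict[str, Set[str]]:
    """Proteins detected in only one of the two profiles."""
    t, r = test.present_ids(), reference.present_ids()
    return {"test_only": t - r, "reference_only": r - t}


# Hypothetical identifiers, used only to exercise the sketch.
test_profile = SurfaceProteinProfile("test", {
    "protein_A": ProteinEntry(present=True, amount=3.2),
    "protein_B": ProteinEntry(present=True),
})
reference_profile = SurfaceProteinProfile("reference", {
    "protein_A": ProteinEntry(present=True, amount=1.1),
    "protein_C": ProteinEntry(present=True),
})
differences = presence_differences(test_profile, reference_profile)
```

Richer comparisons (for example of relative amounts) could be layered on the same structure.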

A “chimeric protein” or “fusion protein” is a fusion of a first amino acid sequence encoding a polypeptide with a second amino acid sequence heterologous to the first amino acid sequence.

“Closed membrane structures” are membrane structures that are topologically configured so as to create at least two chemically distinguishable compartments: an inside and an outside. Closed membrane structures include, but are not limited to, membrane vesicles (whether artificial or obtained from a biological sample), cells and organelles such as mitochondria, lysosomes, peroxisomes, chloroplasts, endosomes, etc.

The term “comprising” is used in the inclusive, open sense, meaning that additional elements may be included.

The term “divalent ion chelator” is used herein to refer to compounds that bind with high affinity (having a dissociation constant under normal biochemical conditions of less than about 10-10 nM) to one or more divalent ions, such as, for example, Ca²⁺, Mg²⁺, Fe²⁺, etc.

The term “including” is used herein to mean “including but not limited to”. “Including” and “including but not limited to” are used interchangeably.

The term “isolated”, as used herein with reference to the subject proteins and protein complexes, refers to a preparation of protein or protein complex that is essentially free from contaminating proteins that normally would be present in association with the protein or complex, e.g., in the cellular milieu in which the protein or complex is found endogenously. Thus, an isolated protein complex is isolated from cellular components that normally would “contaminate” or interfere with the study of the complex in isolation, for instance while screening for modulators thereof.

A “marking moiety” is essentially any molecular moiety that can be used, directly or indirectly, to detect those proteins that are bound to a labeling agent, e.g. by providing a directly detectable moiety such as a fluorescent moiety, a radioactive moiety, etc., or by serving as an affinity capturing agent, such as a biotin (for capture by, e.g., an avidin), a sulfhydryl (for capture by, e.g., another sulfhydryl), a phenylboronic acid (“PBA”) (for capture by, e.g., a salicylhydroxamic acid), a salicylhydroxamic acid (“SHA”) (for capture by, e.g., a phenylboronic acid), etc. Marking moieties are joined to protein binding moieties to form labeling agents.

A “membrane surface protein” is used herein to refer to a protein that is exposed to the environment on the external side of a closed membrane structure. Membrane surface proteins include, but are not limited to, integral membrane proteins (i.e. proteins with one or more transmembrane domains), membrane-anchored proteins (i.e. proteins attached to the membrane through a lipophilic anchor), and membrane-associated proteins (i.e. proteins that have some affinity for the membrane but are not covalently attached to a moiety that is inserted in the membrane).

The terms “proteins” and “polypeptides” are used interchangeably herein.

A “protein binding moiety” or “binding moiety” is a molecular moiety that is capable of interacting, covalently or non-covalently, with a broad range of proteins. Exemplary classes of protein binding moieties include lectins, and amine- or thiol-reactive agents. Protein binding moieties are joined with marking moieties to form labeling agents.

The term “purified protein” refers to a preparation of a protein or proteins which are preferably isolated from, or otherwise substantially free of, other proteins normally associated with the protein(s) in a cell or cell lysate. The term “substantially free of other cellular proteins” (also referred to herein as “substantially free of other contaminating proteins”) is defined as encompassing individual preparations of each of the component proteins comprising less than 20% (by dry weight) contaminating protein, and preferably comprises less than 5% contaminating protein. By “purified”, it is meant, when referring to component protein preparations used to generate a reconstituted protein mixture, that the indicated molecule is present in the substantial absence of other biological macromolecules, such as other proteins (particularly other proteins which may substantially mask, diminish, confuse or alter the characteristics of the component proteins either as purified preparations or in their function in the subject reconstituted mixture). The term “purified” as used herein preferably means at least 80% by dry weight, more preferably in the range of 85% by weight, more preferably 95-99% by weight, and most preferably at least 99.8% by weight, of biological macromolecules of the same type present (but water, buffers, and other small molecules, especially molecules having a molecular weight of less than 5000, can be present). The term “pure” as used herein preferably has the same numerical limits as “purified” immediately above.

The term “reversible bond” includes covalent bonds that are reversible under conditions that are relatively gentle with respect to polypeptides (e.g. a pH that does not cause peptide bond hydrolysis, reducing conditions that do not cause substantial modifications to amino acid side chains other than a sulfhydryl, etc.). A disulfide bond is an exemplary reversible bond.

The term “selective”, as used in reference to the tagging of membrane surface proteins, is intended to indicate that the labeling agent, when used according to methods described herein, primarily labels membrane surface proteins and not other types of proteins, such as cytoplasmic proteins. “Selective” may indicate that more than 70% of tagged proteins are membrane surface proteins (i.e. the mass of tagged proteins that are known to be membrane surface proteins divided by the mass of tagged proteins is greater than 0.7). In other embodiments, “selective” indicates that more than 80%, more than 90% or more than 95% of tagged proteins are membrane surface proteins. The percentage of tagged proteins that are membrane proteins may be assessed by examining a representative sample of the tagged proteins.
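
By way of illustration only, the mass-fraction criterion given above may be sketched in Python as follows. The function names, the protein identifiers and the example masses are hypothetical; the 0.7 default threshold is the value recited in the definition, and the other recited thresholds (0.8, 0.9, 0.95) can be passed in its place.

```python
from typing import Dict, Set


def selectivity_fraction(tagged_masses: Dict[str, float],
                         known_surface: Set[str]) -> float:
    """Mass of tagged proteins known to be membrane surface proteins,
    divided by the mass of all tagged proteins."""
    total = sum(tagged_masses.values())
    if total <= 0:
        raise ValueError("total tagged protein mass must be positive")
    surface = sum(mass for pid, mass in tagged_masses.items()
                  if pid in known_surface)
    return surface / total


def is_selective(fraction: float, threshold: float = 0.7) -> bool:
    """True when the fraction exceeds the chosen threshold."""
    return fraction > threshold


# Hypothetical representative sample of tagged proteins (arbitrary mass units).
sample = {"surface_1": 6.0, "surface_2": 2.0, "cytoplasmic_1": 2.0}
fraction = selectivity_fraction(sample, {"surface_1", "surface_2"})
```

In this hypothetical sample the fraction is 8/10, so the tagging would meet the 70% criterion but not the stricter ones.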

The term “separating” is used herein to refer to any of a variety of methods that may be used to resolve a complex mixture of proteins into simpler mixtures, or pure proteins, for identification. Separation may include, but is not limited to, chromatography, gel electrophoresis (for example two-dimensional gel electrophoresis), adherence to a protein identification array, and/or differential precipitation (or other methods of protein purification), etc. For example, resolution of a mixture of proteins into spots (some will be distinct, others will be less so) by two-dimensional gel electrophoresis is considered “separating”. As a further example, placing a mixture of proteins on a protein identification array comprising an ordered array of antibodies is considered “separating”, because different proteins adhere to different positions on the array.

“Small molecule”, as used herein, is meant to refer to a composition which has a molecular weight of less than about 5 kD and most preferably less than about 2.5 kD. Small molecules can be nucleic acids, peptides, polypeptides, peptidomimetics, carbohydrates, lipids or other organic (carbon containing) or inorganic molecules. Many pharmaceutical companies have extensive libraries of chemical and/or biological mixtures comprising arrays of small molecules, often fungal, bacterial, or algal extracts, which can be screened with any of the assays of the invention.

The term “substantially membrane impermeable”, as used in reference to labeling agents, means that the labeling agent, when employed in methods disclosed herein, is effective for selectively tagging membrane surface proteins.

The term “test compound” as used herein is meant to include, but is not limited to, peptides, nucleic acids, carbohydrates, small organic molecules, natural product extract libraries, and any other molecules (including, but not limited to, chemicals, metals and organometallic compounds).

3. Membrane Labeling Methods

In certain aspects, the invention provides reagents and methods for selectively tagging proteins that are exposed to the extracellular environment. In certain embodiments, selective tagging may be accomplished through the use of a labeling agent. In general, labeling agents have the following properties: (1) the ability to interact relatively non-specifically, and covalently or non-covalently, with a wide range of proteins; and (2) an inability to penetrate the cell membrane or an inability to stably interact with intracellular proteins (i.e. a labeling agent that penetrates the cell but is destroyed or rendered inoperative by the intracellular environment may be effective for selectively labeling cell surface proteins). For example, lectins bind to glycoproteins and show some discrimination between glycoproteins. Labeling agents of the invention generally comprise a protein binding moiety and a marking moiety, wherein the protein binding moiety is capable of interacting covalently or non-covalently with a broad range of cell surface proteins, and wherein the marking moiety is useful in identifying proteins associated with the labeling agent. The protein binding moiety and marking moiety may, in certain instances, be present in a single, multifunctional moiety.

In certain embodiments, the protein binding moiety forms one or more covalent bonds with proteins, often by reacting with, for example, α- or ε-amines, thiols and glycans. Examples of such protein binding moieties are known in the art and, in view of this specification, one of skill in the art would be able to select an appropriate moiety for incorporation into a labeling agent. In general there are three major classes of moieties that form covalent bonds with amines: succinimidyl esters (e.g. N-hydroxysuccinimide, or NHS, esters), including sulfosuccinimidyl esters; isothiocyanates; and sulfonyl chlorides. Other amine-reactive moieties include, but are not limited to, dichlorotriazines, aryl halides and acyl azides. Thiol-reactive moieties include, but are not limited to, haloalkyls (e.g. iodoacetamides), maleimides, and bimanes (e.g. monobromotrimethylammoniobimane, p-sulfobenzoyloxybromobimane). In general, thiol-reactive moieties show a preference for interaction with cysteine residues, with lesser interaction with methionines. Maleimides have higher selectivity for cysteine over methionine than do the haloalkyls.

In further embodiments, the protein binding moiety binds non-covalently to a broad range of proteins. For example, lectins are a class of proteins that bind to glycoproteins through interaction with one or more sugar subunits. Because glycoproteins share many of the same oligosaccharide modifications, lectins tend to bind to a broad array of proteins and are thus suitable as relatively non-specific labeling agents. Exemplary lectins include, but are not limited to, concanavalin A, phytohemagglutinin, isolectin GS-IB4 from Griffonia simplicifolia, lectin HPA from Helix pomatia, lectin SBA from Glycine max, lectin PNA from Arachis hypogaea, lectin GS-II from Griffonia simplicifolia, etc.

In certain embodiments, marking moieties are members of a specific binding pair, meaning that the marking moiety interacts specifically with a binding partner. As an illustrative example, biotin and streptavidin form a specific binding pair. It is preferable that the specific binding pair interact with a dissociation constant (K_D) of less than about 10⁻⁶, and, more preferably, less than about 10⁻⁹. Other exemplary specific binding pairs include, but are not limited to, metals (including partially liganded metals) and metal binding agents (e.g. nickel and polyhistidine, divalent cations and EDTA, iron and hemoglobin, etc.), chitin and chitin binding protein, cellulose and cellulose binding protein, glutathione and glutathione-S-transferase, an antibody-antigen pair, a magnetic metal and a magnet, etc. Another exemplary specific binding pair is PBA (or modifications thereof that retain the ability to interact with SHA) and SHA (or modifications thereof that retain the ability to interact with PBA), which form a covalent bond under relatively mild conditions, the resultant covalent complex being stable even when exposed to strong chaotropic or protein denaturing agents.
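
To illustrate why a lower dissociation constant is preferred, the following Python sketch estimates the fraction of marking moiety that is bound at equilibrium. It assumes a simple reversible 1:1 interaction, a binding partner present in excess (so that its free concentration approximates its total concentration), and dissociation constants expressed in molar units; the text above recites the values without units, so the units, the function name and the example concentration are all assumptions made for illustration.

```python
def fraction_bound(free_partner_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of marking moiety bound to its partner for a
    simple 1:1 interaction: [P] / ([P] + K_D)."""
    if free_partner_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and K_D must be > 0")
    return free_partner_molar / (free_partner_molar + kd_molar)


# Hypothetical binding partner concentration of 1 micromolar.
partner = 1e-6
bound_at_kd_1e6 = fraction_bound(partner, 1e-6)  # K_D equal to the partner concentration
bound_at_kd_1e9 = fraction_bound(partner, 1e-9)  # K_D one thousand-fold lower
```

Under these assumptions, half of the marking moiety is bound when the partner concentration equals K_D, whereas nearly all of it is bound when K_D is one thousand-fold lower.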

In further embodiments, marking reagents provide a novel functional group that may be reacted with additional labeling agents at a later time. In certain exemplary embodiments, a marking reagent provides a thiol group that can react with a second labeling agent that is thiol reactive. Accordingly, in one aspect, membrane surface proteins may be contacted with a first labeling agent that comprises an amine-reactive protein binding moiety and a marking reagent that has a disulfide bond. The labeling reagent attaches to exposed amines. Subsequently, the disulfide bond may be reduced to yield an exposed thiol. The proteins may then be contacted with a second labeling agent that has a desired marking moiety and a thiol-reactive protein binding moiety. In a sense, the method permits the conversion of amines into thiols so that a labeling agent containing a thiol-reactive moiety can be used to label proteins at positions normally having amines. This procedure is advantageous in part because it greatly increases the utility of labeling agents having thiol-reactive moieties. Proteins generally have far more free amines than free thiols, and accordingly thiol-reactive labeling agents tend to label fewer proteins and have a weaker signal per protein. By providing thiol groups at positions that normally have amines, it is possible to achieve stronger and more general labeling with thiol-reactive groups.

In certain embodiments, a labeling agent comprises a marking moiety that comprises a phenylboronic acid (or modifications thereof that retain the ability to interact with SHA) or a salicylhydroxamic acid (or modifications thereof that retain the ability to interact with PBA). A marking moiety comprising a PBA may be captured by an agent comprising an SHA. The agent comprising the SHA may include essentially any useful additional element, such as a fluorescent label, a member of a specific binding pair, etc. Likewise, a marking moiety comprising an SHA may be captured by an agent comprising a PBA. The agent comprising the PBA may include essentially any useful additional element, such as a fluorescent label, a member of a specific binding pair, etc. PBA and SHA react to form a strong complex in moderate conditions and in a biologically-compatible buffer environment. The link formed between a PBA and an SHA is resistant to dissolution, and proteins labeled with such a complex may be subjected to treatment with chaotropic agents that are useful, for example, for membrane solubilization. Such chaotropic agents are harmful to many labeling systems, such as a biotin/avidin system. Labeling agents of this type may include, as a binding moiety, a reactive group that is, for example, reactive with a sugar group, an amine and/or a sulfhydryl. Exemplary binding moieties include N-hydroxysuccinimide (“NHS”) or hydrazide. The hydrazide moiety is useful, for example, for relatively non-specific tagging of glycoproteins. The amount of tagging will depend on the amount of oxidation on the glycan and may be controlled by gradual oxidation of the glycans. Gradual tagging may be used, for example, to tag proteins with two types of labeling agents, such as a first labeling agent that is useful for direct detection of the labeled proteins and a second labeling agent that is useful for affinity capture of the labeled proteins.
The amount of oxidation on a glycoprotein may be controlled by, for example, manipulating the concentration of an oxidant such as NaIO₄, manipulating the time of exposure to an oxidant and the temperature. Alternatively, the tagging can be performed after enzymatic oxidation. Exemplary labeling agents of the PBA or SHA types include those presented in FIGS. 3-8. Certain exemplary labeling agents of these types are available from Prolinx, Inc. (Bothell, Wash.). Exemplary labeling agents of these types, and methods for their preparation, are described in U.S. Pat. No. 5,777,148.

In certain embodiments, the invention provides novel labeling agents based on the PBA and SHA structures described herein, as well as additional reagents that are specifically suitable for performing certain methods of the invention. In one embodiment, the invention provides PBA- or SHA-based labeling agents comprising a disulfide bond positioned within an aliphatic chain. The disulfide bond enables removal of a substantial portion of the reagent from the tagged protein under gentle reducing conditions. The amount of labeling agent removed depends on the position of the disulfide bond within the aliphatic chain. For example, the disulfide bond may be positioned to leave a tag of approximately 89 Da or smaller. Furthermore, the disulfide group decreases the membrane permeability of the labeling agent and facilitates many further manipulations such as detection with mass spectrometry, resolution by gel electrophoresis, etc. In a further exemplary class of novel molecules, a disulfide bond is incorporated into the aliphatic carbon chain of a PBA- or SHA-based molecule comprising hydrazide as its binding moiety. This agent, as noted above, is useful, for example, for relatively non-specific tagging of glycoproteins and may be used in gradual labeling and multiple labeling protocols. Labeling agents of this type are advantageous, in part, because the disulfide bond may be reversed to leave only a minimal group on the labeled proteins. The ability to remove a substantial portion of the labeling agent may, in certain embodiments, facilitate protein separation and/or identification. This labeling agent is not a biomolecule and therefore it has a reduced tendency to interact non-specifically with other proteins. In a further aspect, the invention provides PBA- and SHA-based labeling agents comprising one or more additional hydrophilic moieties. In general, the inventive labeling agents comprise sufficient hydrophilic moieties to be substantially membrane impermeable.
Exemplary hydrophilic moieties include polyethyleneglycols and charged groups such as sulfonates. In certain exemplary embodiments, a hydrophilic moiety is bonded to the NHS active ester portion of a labeling agent. Substantially membrane impermeable labeling agents comprising a PBA or SHA type group may, depending on the embodiment, have a number of previously unappreciated advantages. For example, in some aspects the use of the method of the invention reduces non-specific interactions caused by endogenous biological molecules such as biotin. Since this technology is chemically based, it is free of the limitation of denaturation of the avidin/biotin complex, and therefore it is possible to work with strong chaotropic agents and other denaturing solubilization techniques. This enables a specific and improved tagging of membrane proteins. In certain aspects, the use of these labeling agents and methods of the invention also enables solubilizing tagged membrane proteins with strong buffers such as urea, thiourea and detergents.

The interaction between the marking moiety and a specific binding partner can be used in a variety of ways to identify those proteins that are labeled with the labeling agent. For example, labeled proteins may be separated from unlabeled proteins by affinity purification using the specific binding partner. The specific binding partner would typically be affixed to a solid, semi-solid or insoluble substrate (most commonly a polymeric substance formed into small beads) and exposed to the mixture of labeled and unlabeled proteins. Labeled proteins will tightly associate with the substrate through the interaction between the affixed binding partner and the marking moiety of the labeling agent. In view of this specification, many variations on the general methods of separation using the binding partner are known to those of skill in the art.

In another example, the specific binding partner may be modified with a detectable reagent (e.g. fluorescent, radioactive, colored) and then exposed to a mixture of labeled and unlabeled proteins. Those proteins that are bound to a labeling agent will bind to the detectable binding partner and can then be detected. Labeled and unlabeled proteins may also be separated (for example by gel electrophoresis or chromatography) and then detected using the specific binding partner.

In a further embodiment, labeled membrane surface proteins may be affixed to a solid surface to form an array of the labeled proteins. For this embodiment, the solid surface is prepared by affixing an agent that binds to a marking moiety to be introduced onto the labeled membrane surface proteins. The membrane surface proteins are selectively labeled and contacted with the prepared surface, thereby becoming bound to the solid surface to make an array of labeled membrane surface proteins. For example, if the labeling agent comprises an SHA-type marking moiety, then the solid surface is prepared with a PBA-type moiety. As another example, if the labeling agent comprises a disulfide bond to be reduced so as to reveal a free sulfhydryl, then the solid surface may be prepared with a sulfhydryl-reactive reagent. A solid surface may be a MALDI-TOF MS target (the solid support of the samples to be tested in the instrument). Such MALDI targets can be those used with the Ciphergen (Fremont, Calif.) instrument or with other MALDI-TOF instruments. After attachment to the solid surface, the proteins may be washed with a buffer, such as 25 mM ammonium bicarbonate, pH 8.5, to reduce non-specific binding and to equilibrate the pH to between 7 and 9, and optionally approximately 8.5. If desired, proteins in the array may be analyzed by mass spectrometry. For example, the proteins may be digested with a protease such as trypsin. A MALDI matrix such as alpha-cyano (or an equivalent) is then added, and the mixture of peptides is analyzed using MALDI-TOF or MALDI-TOF/TOF instruments.

In yet another embodiment, the marking moiety is fluorescent. In certain embodiments, a fluorescent marking moiety is substantially membrane impermeable, and optionally the membrane permeability is decreased by modifying the fluorescent moiety with one or more hydrophilic elements, such as polyethylene glycols and/or charged groups such as sulfonates. Exemplary fluorescent moieties, presented here with no intent to be comprehensive or limiting, include fluoresceins, benzoxadiazoles, coumarins, eosins, Lucifer Yellow, pyridyloxazoles, flavins, peridinin-chlorophyll a, phycoerythrins, phycocyanins, and rhodamines. These and many other exemplary fluorescent moieties may be found in the Handbook of Fluorescent Probes and Research Chemicals (2000, Molecular Probes, Inc.). Exemplary fluorescent coumarins are shown below. In certain embodiments, a method of the invention comprises employing a labeling reagent comprising an SHA-type group as a marking moiety, reacting the labeling agent with a closed membrane structure and then reacting the labeled proteins (having the SHA-type group attached) with a PBA-type group that is attached to a fluorescent moiety, such as a fluorescent coumarin. In additional embodiments, the PBA-type group may be part of the labeling agent and the SHA attached to a fluorescent moiety. As will be appreciated by one of skill in the art, the fluorescent coumarins are presented coupled to an exemplary amine-reactive protein binding moiety. It is understood that any of a variety of protein binding moieties may be substituted. In preferred embodiments, the protein binding moiety is a succinimidyl ester that has been modified to increase the hydrophilicity of the labeling agent, optionally by adding a sulfonate. The fluorescent coumarins below are numbered according to the optimal excitation wavelength and are commercially available from Molecular Probes, Inc. under the name Alexa Fluor®.

Certain preferred labeling agents include NHS-SS-biotin (EZ-Link™ NHS-SS-Biotin, Cat. No. 21331, Pierce, Rockford, Ill.), wherein the biotin or the disulfide bond may be considered the marking moiety and NHS is the protein binding moiety, and/or any of the fluorescent coumarins shown above, such as coumarin 488 carboxylic acid, succinimidyl ester, dilithium salt, available through Molecular Probes, Inc. as Alexa Fluor® 488 (Cat. No. A-10235, Molecular Probes, Eugene, Oreg.), wherein the fluorescent coumarin is the marking moiety and the succinimidyl ester is the protein binding moiety.

Another exemplary labeling agent is an Isotope Coded Tag (ICT). An ICT comprises a marking moiety that may carry one or more stable isotopes, preferably deuterium. Another variant is an Isotope Coded Affinity Tag (ICAT), which additionally comprises a marker for affinity purification, such as biotin. Exemplary ICAT labeling agents may be found in Aebersold et al. (Nature Biotechnology (1999) 17: 994-999).
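
As a hedged illustration of how a deuterium-bearing tag becomes distinguishable by mass, the Python sketch below computes the expected spacing between the “light” and “heavy” forms of a peptide carrying one tag. The function name is hypothetical, and the number of deuterium atoms per tag is a free parameter because the present text does not specify it; the value of eight used in the example is an assumption chosen only for illustration.

```python
# Monoisotopic mass difference (Da) for replacing one hydrogen (1H) with deuterium (2H).
DEUTERIUM_MINUS_PROTIUM_DA = 2.014102 - 1.007825


def heavy_light_spacing(n_deuterium: int, charge: int = 1) -> float:
    """Expected m/z spacing between heavy and light forms of a singly tagged
    peptide, given the number of deuterium atoms in the heavy tag and the
    charge state of the ion."""
    if n_deuterium < 0 or charge < 1:
        raise ValueError("n_deuterium must be >= 0 and charge must be >= 1")
    return n_deuterium * DEUTERIUM_MINUS_PROTIUM_DA / charge


# Hypothetical tag carrying eight deuterium atoms.
spacing_singly_charged = heavy_light_spacing(8, charge=1)   # about 8.05 m/z units
spacing_doubly_charged = heavy_light_spacing(8, charge=2)   # about 4.03 m/z units
```

A peptide carrying more than one tag would show a proportionally larger spacing.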

Methods for labeling membrane surface proteins generally comprise contacting closed membrane structures with a labeling agent for sufficient time to allow stable interactions to form between the labeling agent and membrane surface proteins. In many embodiments, cells are contacted with labeling agent for sufficient time and lysed, and the labeled proteins are analyzed. In certain embodiments, the marking moiety is suitable for affinity purification, and labeled proteins may be separated from unlabeled proteins by affinity purification.

In other embodiments, the marking moiety is not suitable for affinity purification but is easily detectable, for example a fluorescent marking moiety. In such cases, cell surface proteins are enriched through any of various methods for enriching membranes. Such methods are, in view of this specification, generally known to one of skill in the art. Typically, membranes and the associated proteins are enriched by a separation method that takes advantage of the difference in density between membranes and other cellular components. For example, gradient centrifugation will yield a fraction of membrane material largely separated from other, non-membrane-associated, cellular components. Other separation methods may take advantage of the poor solubility of membranes in aqueous solutions. For example, insoluble membranes may be separated from soluble components by high-speed centrifugation. Membranes isolated in this fashion will comprise both labeled and unlabeled proteins, but the detectable marking moiety permits the identification of those proteins that are labeled with the labeling agent.

Cells to be labeled may be cultured cells as well as cells obtained from a subject. In preferred embodiments, cells are eukaryotic cells with intact membranes, and in some embodiments, the cells are viable. Preferably, cells are stripped of extracellular matrix prior to labeling so as to tag only those proteins that remain associated with the membrane after removal of the extracellular matrix. An exemplary procedure for removing extracellular matrix from adherent cultured cells comprises detaching cells using a physiological salt buffer (e.g. phosphate buffered saline, “PBS”) and a divalent ion chelator (e.g. EDTA) solution. The chelating agent causes depolymerization of extracellular matrix proteins, which are subsequently washed away by one or more salt buffer washes. Thus, only proteins that are associated with the cell surface will remain and be labeled in subsequent steps. Similar methods may be employed to remove the extracellular matrix from cells obtained from a subject.

In certain preferred embodiments, the labeling reaction is performed at a temperature cold enough to minimize membrane protein turnover. Preferred temperatures range from about 1 degree C. to 10 degrees C., and most preferably the temperature is about 4 degrees C. An exemplary buffer for labeling with a succinimide-based protein binding moiety is PBS/CM (PBS with 1.3 mM CaCl₂, 1 mM MgCl₂). The binding reaction between the labeling agent and the proteins must often be quenched. For example, a labeling agent that covalently binds to amines can be quenched with a compound containing amines, e.g. glycine or Tris. Quenching may also be accomplished by lowering the pH by, for example, adding ammonium chloride, or by a combination of pH lowering and the addition of primary amines. Quenching is typically followed by a wash in a physiological salt buffer and then transfer into a solubilization buffer. Solubilization buffers typically comprise buffering agents at pH 6-8, divalent cations, salts and a non-ionic detergent. Exemplary detergents include Triton X-100 and, most preferably, ASB-14. An exemplary solubilization buffer contains 50 mM Tris-HCl, pH 7.6, 150 mM NaCl, 10% glycerol, 2% ASB-14, 5 mM EDTA, 1 mM EGTA, 1.5 mM MgCl₂, and protease inhibitors. Solubilization is usually carried out at a cool temperature to minimize damage to the proteins. After solubilization, the labeled proteins are separated from the unlabeled proteins.
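
For illustration, the exemplary solubilization buffer recited above can be assembled from concentrated stocks using the usual dilution relation (final concentration × final volume = stock concentration × stock volume). In the Python sketch below, the final concentrations are those recited in the text, whereas the stock concentrations, the function name and the 10 mL batch size are assumptions chosen only for the example. Protease inhibitors are omitted because no amount is specified.

```python
# name -> (final concentration, stock concentration); each pair shares one unit.
# Final values come from the text; stock values are assumed for illustration.
COMPONENTS = {
    "Tris-HCl pH 7.6 (mM)": (50.0, 1000.0),
    "NaCl (mM)": (150.0, 5000.0),
    "glycerol (%)": (10.0, 100.0),
    "ASB-14 (%)": (2.0, 10.0),
    "EDTA (mM)": (5.0, 500.0),
    "EGTA (mM)": (1.0, 500.0),
    "MgCl2 (mM)": (1.5, 1000.0),
}


def stock_volumes(final_volume_ml, components=COMPONENTS):
    """Volume (mL) of each stock, plus water, needed for the final volume."""
    volumes = {}
    for name, (final_conc, stock_conc) in components.items():
        if final_conc > stock_conc:
            raise ValueError(f"stock for {name} is too dilute")
        volumes[name] = final_conc * final_volume_ml / stock_conc
    water = final_volume_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to reach the final volume")
    volumes["water"] = water
    return volumes


recipe = stock_volumes(10.0)  # hypothetical 10 mL batch
```

With these assumed stocks, a 10 mL batch takes 0.5 mL of the Tris-HCl stock and 2 mL of the ASB-14 stock, with the remainder made up with water.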

In certain embodiments, biotin is used as the marking moiety of the labeling agent. The resultant labeled proteins are therefore biotinylated. Such proteins are preferably affinity purified by contacting them with a biotin-binding substrate such as avidin-sepharose beads. After the binding reaction, unbound proteins are removed by washing. Suitable wash buffers are, in view of this specification, known to those of skill in the art. An exemplary buffer comprises 20 mM Tris-HCl, pH 7.6, 300 mM NaCl, 10% glycerol, 0.1% Triton X-100, 0.1% SDS, 1 mM EDTA, 1 mM EGTA, 1.5 mM MgCl₂, and protease inhibitors. Release of biotinylated proteins from avidin beads can be difficult. One method is to use a reversible connection between the protein binding moiety and the biotin. Preferred reversible connections are disulfide bonds which may be broken by reduction with an appropriate reducing agent. Application of the reducing reagent may result in a dramatic reduction in pH which, depending on the downstream use for the protein preparation, may be undesirable. Preferably, the reduction is accomplished using TCEP-HCl (Cat. No. 580560, Calbiochem), due to its superior stability and effectiveness over a wide pH range (1.5-8.5). The reducing solution is then strongly buffered, for example by addition of greater than 25 mM Tris base, and most preferably by addition of approximately 50 mM Tris base. Most preferred reducing solutions will have sufficient buffering material added to have a pH in the range of 6.5 to 8, most preferably about 7.5. An exemplary reducing solution comprises 50 mM Tris base, 20 mM TCEP-HCl, 20 mM NaOH and most preferably further includes a cocktail of protease inhibitors and/or roughly 150 mM NaCl. Reduction may be performed at essentially any temperature that is favorable for recovery of protein, and in certain embodiments, reduction is performed at a temperature ranging from 20 to 30 degrees C., preferably at room temperature.
After incubation, labeled proteins substantially free of unlabeled proteins are available for further analysis. This method may result in the production of free thiols that, as described above, can be used to label proteins with a second labeling agent that reacts with thiols. This procedure may, depending on the method in which it is employed, provide a number of advantages. For example, in the detection of relatively low abundance membrane proteins, after enrichment of membrane proteins according to the biotin/avidin method described above, the free thiols may be reacted with a radioactive labeling agent. These labeled membrane proteins can then be identified using gel electrophoresis, with the time of exposure to a system for detecting radiation (such as a film or a PhosphorImager, available from Amersham Biosciences) adjusted such that low abundance proteins can be detected. Alternatively, a fluorescent label may be used for the second labeling to allow detection by fluorescent systems. In addition, if distinguishable fluorescent labels are used for different preparations of labeled membrane surface proteins, then a differential display of membrane proteins can be achieved. Fluorescent detection systems may be coupled with chromatography as well as gels.
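
As an illustrative aid, the exemplary reducing solution described above (50 mM Tris base, 20 mM TCEP-HCl, 20 mM NaOH, optionally with roughly 150 mM NaCl) can be converted into weighed amounts with the Python sketch below. The concentrations are those recited in the text; the molar masses are standard values, and the function name and the 100 mL batch size are assumptions made for the example. Protease inhibitors are omitted because no amount is specified.

```python
# Standard molar masses in g/mol.
MOLAR_MASS = {
    "Tris base": 121.14,
    "TCEP-HCl": 286.65,
    "NaOH": 40.00,
    "NaCl": 58.44,
}

# Final concentrations (mM) recited for the exemplary reducing solution.
REDUCING_SOLUTION_MM = {"Tris base": 50.0, "TCEP-HCl": 20.0, "NaOH": 20.0}


def grams_needed(concentrations_mm, volume_ml, include_nacl=False):
    """Mass (g) of each solid for the requested volume of solution."""
    recipe = dict(concentrations_mm)
    if include_nacl:
        recipe["NaCl"] = 150.0  # the optional, "roughly 150 mM" component
    litres = volume_ml / 1000.0
    return {name: (mm / 1000.0) * litres * MOLAR_MASS[name]
            for name, mm in recipe.items()}


masses = grams_needed(REDUCING_SOLUTION_MM, 100.0, include_nacl=True)  # hypothetical 100 mL batch
```

Because TCEP-HCl is acidic, the pH of the finished solution would still be checked against the 6.5 to 8 range stated above rather than inferred from the weighed amounts.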

As described above, a variety of labeling agents may be designed to incorporate a reversible bond such as a disulfide bond. The reduction protocol described above for use with a biotin/avidin system may also be used with other labeling agents comprising a disulfide bond, and the reducing conditions described therein may be used to generate the free thiols regardless of the moiety attached thereto. For example, the reducing conditions may be used to generate free thiols from any of the labeling agents comprising a PBA- or SHA-type group and a disulfide.

In a further embodiment, a lectin is used as the protein-binding moiety. Lectins may easily be modified with any appropriate fluorescent marking moiety.

In an additional embodiment, the labeling agent comprises a fluorescent coumarin as the marking agent and succinimidyl ester as the protein binding moiety. The succinimidyl ester binds covalently to primary amines and the steps of the labeling process are essentially as described above. However, fluorescent coumarins are not easily used for affinity purification. Accordingly, the labeled cell surface proteins may be substantially enriched by using membrane enrichment methods described above. Alternatively, labeled proteins may be directly resolved, for example by two-dimensional (2D) electrophoresis, and cell surface proteins are distinguished from other proteins by the fluorescent label.

The invention provides for differential display methods that permit direct comparison of two or more samples. In general, a first sample is reacted with a first labeling agent and a second sample is reacted with a second labeling agent. The first and second labeling agents are typically identical except for having a detectably different marking moiety. In a preferred embodiment, two samples are treated with lectins modified with different fluorophores. The samples are then mixed and resolved. In certain embodiments, resolution is accomplished by electrophoresis and preferably two-dimensional electrophoresis. All of the proteins migrate to the appropriate position on the gel, and the comparative amount of each protein in each sample is measured by reading the respective fluorescence signals. In an alternative embodiment, fluorescent non-lectin labeling agents are used. In yet another embodiment, the labeling agents comprise ICT or ICAT moieties. For example, the first sample is labeled with a “light” or non-deuterated ICT or ICAT, while the second sample is labeled with a “heavy” or deuterated ICT or ICAT. The differentially labeled samples are mixed and subjected to mass spectrometry. MS is capable of distinguishing the “heavy” labeled proteins from the “light” labeled proteins and provides a direct comparison of the amount of each labeled protein present in each sample. As discussed above, many different fluorophores, and potentially many distinguishable ICT or ICAT moieties, are available, and it is anticipated that the methods described herein may be used with more than two samples, so long as each sample is labeled with a distinguishable marker. For example, three samples may be differentially labeled with red, green and blue fluorescing moieties, mixed and analyzed to provide a differential display of the relative membrane surface protein abundance in each sample.
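
By way of illustration only, the comparison of signals from two differentially labeled samples may be sketched in Python as follows. The function name, the protein names and the intensity values are hypothetical and are not taken from any particular instrument or software; the sketch assumes that a summed signal intensity per protein has already been obtained for each label.

```python
from math import log2

def relative_abundance(first_sample, second_sample):
    """Return {protein: (ratio, log2 ratio)} of second-sample to first-sample signal.

    Each argument maps a protein name to a summed signal intensity
    (hypothetical values; any consistent intensity unit may be used).
    """
    result = {}
    for protein in sorted(set(first_sample) & set(second_sample)):
        first, second = first_sample[protein], second_sample[protein]
        if first <= 0 or second <= 0:
            continue  # a ratio is undefined without signal in both samples
        ratio = second / first
        result[protein] = (ratio, log2(ratio))
    return result

# Hypothetical intensities for a "light" (sample 1) and "heavy" (sample 2) label.
sample_1 = {"protein_A": 1000.0, "protein_B": 500.0, "protein_C": 0.0}
sample_2 = {"protein_A": 2000.0, "protein_B": 500.0, "protein_C": 300.0}
ratios = relative_abundance(sample_1, sample_2)
```

In this sketch a protein detected in only one sample is omitted from the ratio table; in practice such proteins may be of particular interest and could be reported separately.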

4. Methods of Processing and Identifying Membrane Proteins

Having obtained an enriched preparation of membrane surface proteins, it is generally desirable to identify and characterize the proteins present. In certain embodiments, methods of the invention include the identification of proteins present in the preparation. In preferred embodiments, a plurality of proteins are identified, and it is particularly preferable to identify more than 10, more than 20, 25, 30, 50, 100 or 1000 proteins in a preparation. It is also desirable to characterize other aspects of each protein, such as abundance, the presence of one or more post-translational modifications, and membrane topology.

In certain aspects, the invention provides methods of generating profiles of membrane surface proteins. In general, a profile of membrane surface proteins is obtained by combining a step of selectively labeling membrane surface proteins with a step of identifying labeled proteins. Preferred profiles will include information about the identity and amount of membrane surface proteins present in a sample. Profiles may be generated for a number of different samples, possibly representing a range of tissue types and clinical states. A profile may be compared against one or more other profiles. Such comparisons may be useful for indicating changes in protein levels, modifications, etc. In addition, such comparisons may be used to characterize a sample. For example, a profile from a possible cancer sample may be compared against a range of cancerous and non-cancerous profiles to determine whether the sampled material is indeed cancerous.

In view of this specification, many techniques for identifying and/or characterizing proteins of the subject preparations are available to one of skill in the art. Certain methodologies require preparative steps, such as, for example, resolution of the complex mixture of proteins into simpler mixtures or substantially pure proteins. Other methodologies may be used without such preparative steps. While not intended to be limiting, several preferred methods of analysis are presented herein. Such methods may be combined in various ways that, in view of this specification, will be appreciated by one of skill in the art.

Gel Electrophoresis

Gel electrophoresis of proteins is a common methodology that may provide many different forms of information, including protein size, isoelectric point and abundance. In addition, gel electrophoresis is a powerful method for the resolution of complex protein mixtures into bands or spots of reduced complexity. Electrophoretic methods include, but are not limited to, one-dimensional SDS-PAGE, isoelectric focusing, one-dimensional non-denaturing gel electrophoresis and 2D gel electrophoresis. 2D gel electrophoresis involves one dimension of isoelectric focusing and another dimension of SDS-PAGE. Proteins resolved by gel electrophoresis may be used for further analysis, if desired. Proteins may be eluted or otherwise obtained from the gel by a variety of methods including, for example, cutting out the appropriate portion of the gel, optionally followed by electroelution, or alternatively electroblotting onto a membrane such as nitrocellulose or polyvinylidene fluoride. Proteins so processed may then be used in a variety of analytic methods, including but not limited to, antibody analysis (eg. Western blot, ELISA, protein array), Edman degradation, mass spectrometry, etc.

Chromatography

Proteins may be resolved into simpler mixtures or to substantial purity by a variety of chromatography methods known in the art. Such chromatography methods may include, for example, anion exchange, cation exchange, hydrophobic interaction, reverse phase, size exclusion, hydroxylapatite, etc. In addition, a variety of affinity chromatography methods may be employed, depending on the particular proteins of interest. Chromatography methods may be employed in series or performed repeatedly to obtain higher degrees of resolution. Chromatography is not only a preparative tool. Many types of chromatography provide information about the proteins. For example, size exclusion chromatography can be used to obtain molecular weights of native proteins in both reducing and non-reducing conditions. Ion exchange columns provide information regarding the pI of the subject proteins.

Mass Spectrometry

With the extensive availability of protein sequence information, mass spectrometry (MS) may be employed for rapid identification of proteins present in cell surface protein preparations. Mass spectrometry may also be useful for determining post-translational modifications and membrane topology.

Sample Preparation for Mass Spectrometry

If proteins are first resolved by gel electrophoresis, certain preparative steps are preferred. In order to facilitate the identification of proteins by MS, bands containing one or more protein species are excised from the gel, digested into polypeptides by treatment in situ with a protease such as trypsin, and transferred into solutions and concentrations compatible with MS analysis. Techniques for the in-gel processing of proteins have been refined into standardized protocols. The so-called “in-gel digestion” approach has been developed for the enzymatic fragmentation of proteins embedded in gel pieces, and the extraction of the resulting peptides (Wilm et al. (1996) Nature 379: 466-9). Sequencing-grade modified trypsin has been the enzyme of choice for high-throughput identification of proteins. In one exemplary method, a band of interest is excised from the gel, and subjected to reduction and alkylation to break the cysteine bridges and prevent them from reforming. After equilibration with the corresponding buffer the gel pieces are swelled in a solution of trypsin, allowing the enzyme to enter into the gel. The digestion is allowed to proceed at 37° C., generally overnight. The resulting peptides are extracted and prepared for MS analysis.

Mass Spectrometers for Protein Identification

Typically, a mass spectrometer consists of at least three components: an ionization device, a mass separator, and a detector. Mass spectrometry is a very powerful technique for separating and identifying molecules that are charged in the gas phase. Mass spectrometers are generally only able to separate either positively or negatively charged analytes at a time. The term ionization is misleading, because most mass spectrometers do not perform the ionization of molecules per se. Instead, the term ionization relates to the transfer to gas phase of analytes, while maintaining their charge, and/or acquiring a charge from the sample environment, typically in the form of a proton. The study of peptides and proteins is dominated by two sample ionization techniques: matrix-assisted laser desorption ionization (MALDI) (Aebersold et al. (1993) Curr Opin Biotechnol 4: 412-9; Arnott et al. (1993) Clin Chem 39: 2005-10; Hillenkamp et al. (1991) Anal Chem 63: 1193A-1203A) and electrospray ionization (ESI) (Fenn et al. (1990) Mass Spectrometry Reviews 9: 37).

MALDI Mass Spectrometers, Peptides and Proteins Analysis

MALDI ionization is a technique in which samples of interest, in this case peptides and proteins, are co-crystallized with an acidified matrix (Nelson et al. (1994) Rapid Commun Mass Spectrom 8: 627-31). The matrix is a small molecule, which absorbs at a specific wavelength, generally in the ultraviolet (UV) range, and dissipates the absorbed energy thermally. Typically, a pulsed laser beam is used to rapidly (within a few ns) transfer energy to the matrix. This rapid transfer of energy causes the matrix to rapidly dissociate from the surface, generating a plume of matrix and the co-crystallized analytes in the gas phase. It is not clear if the analytes acquire their charge during the desorption process or after entering the gas plume of molecules by interacting with the matrix molecules. However, the end result is a small pocket of charged analytes that are present in the gas phase. To date, MALDI has been predominantly coupled in-line with time of flight (TOF) mass spectrometers. The function of a time of flight mass spectrometer is to measure the time that analytes take to fly across a fixed path length (the TOF tube or chamber). The charged analytes present in the plume are therefore transferred to the TOF tube after an appropriate time delay. In order to move the analytes into the TOF tube, a high voltage is applied to the MALDI plate, generating a strong electric field between the plate and the entrance of the TOF chamber. Smaller analytes will reach the entrance of the chamber more rapidly than larger analytes (i.e. constant kinetic energy applied, generating different velocities for the analytes). Once in flight, the analytes are in a field-free region and separate along the tube while moving toward the detector. Again, analytes of lesser mass move along the tube faster and reach the detector prior to analytes of greater mass. The detector is synchronized with the laser shots and time delay, and measures the peptide and protein ions as they arrive over time. When the mass range is calibrated by using standards of known mass and charge, the time of flight for a given ion can be converted to a mass. The end result is a spectrum of observed intensity versus ion (protein or polypeptide) mass.
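
The relationship between flight time and mass described above may be sketched, for illustration only, in Python. The sketch assumes an idealized instrument in which every ion of charge z acquires kinetic energy z·e·V before entering a field-free tube of length L, so that the flight time is L·sqrt(m/(2·z·e·V)); the accelerating voltage, the tube length and the function names are hypothetical, and real instruments depart from this ideal behavior.

```python
from math import sqrt

E_CHARGE = 1.602176634e-19   # elementary charge, coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit, kilograms

def flight_time(mass_da, charge, voltage, tube_length):
    """Ideal flight time (s) across a field-free tube of tube_length (m)."""
    kinetic_energy = charge * E_CHARGE * voltage            # z*e*V, joules
    velocity = sqrt(2.0 * kinetic_energy / (mass_da * DALTON))
    return tube_length / velocity

def calibrate(t1, mz1, t2, mz2):
    """Two-point calibration of the linear relation sqrt(m/z) = a*t + b."""
    a = (sqrt(mz2) - sqrt(mz1)) / (t2 - t1)
    b = sqrt(mz1) - a * t1
    return a, b

def time_to_mz(t, a, b):
    """Convert a measured flight time to m/z using calibration constants."""
    return (a * t + b) ** 2

# Hypothetical settings: 20 kV acceleration, 1 m flight tube, singly charged ions.
t_light = flight_time(1000.0, 1, 20000.0, 1.0)
t_mid = flight_time(2000.0, 1, 20000.0, 1.0)
t_heavy = flight_time(3000.0, 1, 20000.0, 1.0)

# Calibrate with the two "standards" and convert the unknown's flight time.
a, b = calibrate(t_light, 1000.0, t_heavy, 3000.0)
mz_mid = time_to_mz(t_mid, a, b)
```

Under these assumptions the lighter ion arrives first, and the ion of intermediate flight time is assigned an m/z of approximately 2000 from the two calibration standards.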

MALDI-TOF MS is easily performed with modern mass spectrometers. Typically the samples of interest, in this case peptides or proteins, are mixed with a matrix mixture and successively spotted onto a polished stainless steel plate (MALDI plate). Commercially available MALDI plates can hold 96 samples per plate. The MALDI plate is then installed into the vacuum chamber of a MALDI mass spectrometer. The pulsed laser is then activated and the time of flight acquisition triggered as previously described. An MS spectrum containing the mass-to-charge ratios of the peptides/proteins is then generated. The charge of molecules ionized by MALDI is typically 1.

Recently, the MALDI ion source technology has also been coupled with a hybrid orthogonal mass spectrometer. In this design the MALDI ionization approach is, but for minor modifications, essentially as described above. However, the TOF detector is replaced with an orthogonal mass spectrometer (e.g. Q-Star by PE-Sciex), which consists of a quadrupole followed by a collision cell and a pulsed perpendicular TOF MS. The hybrid instrument (MALDI-Q-Star) has the advantages of high resolution mapping of the peptide masses contained in a peptide mixture, and the option of efficient fragmentation of selected peptides by collision induced dissociation. These fragmentation patterns contain information related to the amino acid sequence of the peptides.

ESI Mass Spectrometers, Peptides and Protein Analysis

Electrospray ionization is also widely utilized to introduce protein and peptide mixtures to mass spectrometers. Electrospray ionization (ESI) allows the transfer of analytes from a liquid phase to the gas phase at atmospheric pressure. The ionization process is achieved by applying an electric field between the tip of a small tube and the entrance of a mass spectrometer. The electric field induces the charged liquid at the end of the tip to form a cone, called a Taylor cone, that minimizes the charge/surface ratio. Droplets are liberated from the end of the cone, and travel towards the mass spectrometer entrance. The liberated droplets go through a repetitive process of solvent evaporation from the droplets and fragmentation of the droplets into smaller droplets. This process leads to a large number of droplets of vanishing size until the solvent has disappeared and the charged analytes are in the gas phase. Moreover, while the droplets are shrinking, the pH decreases, causing protonation of the analytes. Therefore, it is common to obtain multiply charged analytes by ESI when dealing with trypsinized proteins.
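
Because ESI commonly yields multiply charged analytes, a single peptide may appear at several m/z values. As an illustrative sketch only, and assuming that each charge is carried by an added proton, the expected m/z for a given charge state, and the charge and neutral mass inferred from two adjacent charge states, may be computed as follows. The function names and the example mass are hypothetical.

```python
PROTON = 1.007276  # mass of a proton, Da

def mz(neutral_mass, z):
    """m/z of an analyte of the given neutral mass carrying z protons."""
    return (neutral_mass + z * PROTON) / z

def charge_from_adjacent(mz_high, mz_low):
    """Charge of the peak at mz_high, given that the peak at mz_low
    is the same analyte carrying one additional proton."""
    return round((mz_low - PROTON) / (mz_high - mz_low))

def neutral_mass(mz_value, z):
    """Neutral mass recovered from an m/z value and its charge."""
    return z * (mz_value - PROTON)

# Hypothetical peptide of neutral mass 1500 Da observed as 2+ and 3+ ions.
peak_2plus = mz(1500.0, 2)
peak_3plus = mz(1500.0, 3)
z = charge_from_adjacent(peak_2plus, peak_3plus)
recovered = neutral_mass(peak_2plus, z)
```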

Typically, electrospray ionization is used in conjunction with triple quadrupole, ion trap, or hybrid quadrupole-time-of-flight mass spectrometers (Patterson et al. (1995) Electrophoresis 16: 1791-814). Electrospray ionization has a significant advantage over MALDI in terms of ease of coupling to separation techniques such as HPLC, LC and CE. ESI can also be used for the continuous infusion of samples. Furthermore, the tendency to provide multiply charged peptides from tryptic digests, in conjunction with collision-induced dissociation, allows the generation of enhanced MS/MS spectra over what has been achieved with either conventional MALDI-TOF or the hybrid MALDI-Q-Star instrument.

Electrospray ionization and the MALDI-Q-Star instruments both rely on collision-induced dissociation to generate fragmentation patterns (MS/MS spectra) related to a selected peptide amino acid sequence. Typically the generation of MS/MS spectra requires two independent experiments. In the first pass, a mixture of peptides (a tryptic digest) is separated according to mass-to-charge (m/z) ratio by the mass spectrometer and a list of the most intense peptide peaks is established. In the second pass, the instrument is adjusted such that only a specific m/z species (identified during the first-pass analysis), presumably a unique peptide ion, is allowed to enter the mass spectrometer. These ions are directed into a collision cell and their kinetic energy is increased. In the collision cell the ions collide with inert gas molecules with sufficient kinetic energy to break peptide bonds. This process is termed collision-induced dissociation, CID, and generates both charged and neutral fragments derived from the same ‘parent’ ion. Finally, the newly generated charged fragments are separated by the mass spectrometer according to their m/z, creating the MS/MS spectrum. By application of appropriate collision energy, the fragmentation occurs predominantly at the peptide bonds and a ladder of fragments is generated. The difference in mass between certain peaks corresponds to the loss of a single amino acid. The sequence of the peptide can then be reconstituted by a ladder-walk done by measuring the mass difference between successive masses for specific types of ions (i.e. y or b series ions).
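
The ladder-walk may be sketched in Python as follows, for illustration only. The sketch assumes an idealized, complete series of singly charged y ions with no noise peaks, uses standard monoisotopic residue masses, and reports leucine and isoleucine (which have the same mass) as "L". The function names, the tolerance and the example peptide are hypothetical.

```python
# Monoisotopic residue masses in Da; "L" stands for leucine or isoleucine.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

def y_ions(sequence):
    """Simulated singly charged y-ion m/z values, y1 (C-terminal residue) first."""
    ions, total = [], WATER + PROTON
    for residue in reversed(sequence):
        total += RESIDUE_MASS[residue]
        ions.append(total)
    return ions

def ladder_walk(peaks, tolerance=0.01):
    """Read residues from the mass differences between successive y ions.

    Returns the residues in N-terminal to C-terminal order; a gap that
    matches no residue mass is reported as '?'. The residue contained in
    the smallest ion itself is not recovered by differences alone.
    """
    peaks = sorted(peaks)
    residues = []
    for lower, upper in zip(peaks, peaks[1:]):
        delta = upper - lower
        match = "?"
        for residue, mass in RESIDUE_MASS.items():
            if abs(delta - mass) <= tolerance:
                match = residue
                break
        residues.append(match)
    # y ions grow toward the N-terminus, so the walk is reversed for reading.
    return "".join(reversed(residues))

partial_sequence = ladder_walk(y_ions("SAMPLER"))
```

Here the walk over the simulated y series of the hypothetical peptide SAMPLER recovers the residues SAMPLE; the C-terminal arginine is contained in the y1 ion and would be inferred from the mass of that ion rather than from a difference.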

The peptide masses are typically accurately measured using a MALDI-TOF or a MALDI-Q-Star mass spectrometer down to the low ppm (parts per million) precision level. The ensemble of the peptide masses observed in a tryptic digest can be used to search protein/DNA databases in a method often called peptide mass fingerprinting (Clauser et al. (1995) Proc Natl Acad Sci USA 92: 5072-6; Cottrell (1994) Pept Res 7: 115-124; Pappin (1997) Methods Mol Biol 64: 165-73). In this approach protein entries in the databases are ranked according to the number of peptide masses that match to their predicted trypsin digestion pattern. Commercially available software provides a scoring scheme based on the size of the databases, the number of matching peptides, and the different peptides. Depending on the number of peptides observed, the accuracy of the measurement, and the size of the genome of the particular species, unambiguous identification can be obtained.
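
For illustration only, the ranking step of peptide mass fingerprinting may be sketched in Python. The sketch assumes complete trypsin cleavage after lysine or arginine except where the next residue is proline, uses standard monoisotopic residue masses, and ranks entries by a simple count of matched masses rather than by the scoring schemes of commercial software. The database entries, the tolerance and the function names are hypothetical.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565

def tryptic_peptides(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER

def rank_proteins(observed, database, ppm=20.0):
    """Rank database entries by the number of observed masses they explain."""
    scores = {}
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
        scores[name] = sum(
            1 for m in observed
            if any(abs(m - t) / t * 1e6 <= ppm for t in theoretical)
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical two-entry database and simulated observed masses.
database = {"protein_X": "AAKGGRPLKSS", "protein_Y": "MEEKWTR"}
observed = [peptide_mass("AAK"), peptide_mass("GGRPLK")]
ranking = rank_proteins(observed, database)
```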

MS/MS spectra are a second set of information that can be used to identify a protein. The MS/MS spectra contain the fragmentation pattern related to the amino acid sequence of specific peptides. The analysis of MS/MS spectra is typically more intensive. The approaches that are in use for the interpretation of these spectra can be classified into three subgroups according to the level of user intervention required.

In the first subgroup no interpretation of the spectra is required. The information contained in the spectra is directly correlated with protein/DNA sequence information contained in databases. Different algorithms have been developed for this specific task. These algorithms automatically search uninterpreted MS/MS spectra against protein and DNA databases and some are freely available (for non-commercial entities) and can be accessed over the Web. Mascot by Matrix Science (www.matrixscience.com), and ProteinProspector from UCSF (http://prospector.ucsf.edu) are the most commonly used web-based MS/MS search engines. The identification of the protein is typically unambiguous through the number of peptides that match to the same protein. Another algorithm that is popular is “Sequest” (Eng et al. (1994) J. Am. Soc. Mass Spectrom. 5: 976-989; Yates et al. (1995) Anal Chem 67: 1426-36; Yates et al. (1998) Peptide sequencing by tandem mass spectrometry, p. 529-538, Cell Biology: A Laboratory Handbook, vol. 4. Academic Press, San Diego). For every MS/MS spectrum submitted this algorithm searches protein/DNA databases for the top 500 isobaric peptides and the corresponding predicted spectra are generated. The predicted spectra are rapidly matched against the measured spectra by multiplication in the frequency domain using a fast Fourier transform. Correlation parameters, which indicate the quality of the match between predicted and measured spectra, are then deduced. A high cross-correlation indicates a good match with the measured spectrum. Although protein identification has been performed with as little as one peptide using this algorithm, unambiguous identification of the provenance of a protein is often achieved by the multitude of peptides that match to the same entry in a database. The Sequest algorithm is computing intensive, and for high-throughput demand can rapidly paralyze a dual-CPU server. The slow nature of Sequest is due to its attempt to find the best matching 500 isobaric peptides. The larger the database being repeatedly scanned to compile this list, the longer this function takes. An improved version of the software, called Turbo-Sequest, predigests and orders the databases, resulting in greatly improved searching times.

The approaches in the second subgroup all involve the partial interpretation of the MS/MS spectra, and therefore require human intervention. The dominant approach, often called “sequence-tag” (Mann et al. (1994) Anal Chem 66: 4390-9; Patterson et al. (1996) Electrophoresis 17: 877-91; Wilkins et al. (1996) Biochem Biophys Res Commun 221: 609-13), consists of reading the mass spacing between a few specific fragments in a MS/MS spectrum and generating a short section (tag) of the peptide sequence. Using this tag and the residual mass information, the provenance of the peptide can be ascertained by comparison with sequence and calculated masses obtained from protein databases for isobaric peptides. Every MS/MS spectrum requires the generation of a tag followed by database searching. Unambiguous identification of the protein is established by the multitude of peptides that match to the same protein. Over the years, different variations on this theme have been developed to perform database searching using sequence tags. The main limitation of the “sequence-tag” approach in large-scale proteomics efforts is the labor and expertise required to manually generate the required partial interpretations of the MS/MS spectra. Attempts to automate the generation of sequence tags are underway to solve this problem.

The last subgroup, called de novo sequencing of proteins (Shevchenko et al. (1997) Rapid Commun Mass Spectrom 11: 1015-24; Papayannopoulos et al. (1995) Mass Spect. Rev. 14: 49-73), is often used as a last resort when no matching information is available in databases and the quality of the MS/MS spectra is good. The MS/MS spectra of peptides contain ladder-type information, which, in principle, indicates their amino acid sequence. Experienced mass spectrometrists can manually extract the peptide sequence from the CID spectra (de novo sequencing).

Depending on the quality of the data and the complexity of the species under study, a single confident match between a peptide MS/MS spectrum and a protein sequence entry can be enough to identify a protein, or a family of proteins. The required sequence coverage for unambiguous identification increases for homologous proteins, when the peptide identified is not unique to a protein, when dealing with databases of poor fidelity and/or partial coverage, and when searching SNP databases. Clearly, every subsequent peptide MS/MS spectrum that is matched to the same protein further increases the confidence level of the identification.

The end result of each of these MS-based approaches is the delivery of the identity of the proteins presented for analysis or the partial amino acid sequence of novel proteins.

Antibody-Related Methods

Antibodies are powerful tools for protein identification, quantitation and isolation. Following gel electrophoresis, Western blotting methods may be performed using one or more antibodies to identify and, if desired, quantify a number of different proteins present in a preparation. Enzyme-linked immunosorbent assays (ELISAs) may also be performed to quantify protein levels in a sample. Parallel ELISAs using a range of different antibodies may be performed in a high-throughput method to rapidly obtain quantitative information about many different proteins in a sample. Antibodies may also be used as a part of a protein identification array (discussed below).

Protein purification can also be achieved using antibodies. For example, antibodies may be conjugated to a matrix and used for immunoaffinity chromatography. Purification can also be achieved by immunoprecipitation. Typically a protein mixture is contacted with one or more antibodies, and then the antibody-associated proteins are precipitated by addition of beads coated with an antibody-specific binding agent, such as protein A. Antibodies may also be tagged with, for example, a biotin molecule, so that precipitation can be achieved using a streptavidin matrix.

It is understood that antibodies come in a variety of forms including single chain antibodies, polyclonal, monoclonal, Fab fragments, etc.

Protein Identification Arrays

The identity, abundance and even post-translational modification state of proteins in a complex mixture can be determined using any of a variety of protein identification arrays (WO 00/04389; WO 00/04382; WO 00/04390). In general, a protein identification array is an ordered array of protein capture agents, wherein each protein capture agent is capable of binding to a particular protein. Protein capture agents may be specific to a particular protein or to certain epitopes, including post-translational modifications. The interaction of a protein capture agent with the corresponding protein(s) may be sensitive or insensitive to post-translational modifications. In general, protein capture agents bind to their binding partners specifically and with a dissociation constant (K_D) less than 10⁻⁶ M. Protein capture agents will typically be a biological molecule such as a polypeptide or a polynucleotide (including standard nucleic acids and artificial nucleic acid analogs with altered bases and/or altered backbones, including peptide nucleic acids, locked nucleic acids, linked nucleic acids, mannitol, hexitol, glucitol etc. nucleic acids). For example, antibodies are highly suitable protein capture agents.

Protein capture agents may be organized into arrays through a variety of methods. In general, arrays can be sorted into three types: (1) arrays wherein the protein capture agents are distributed, typically in solution, in a plurality of wells; (2) arrays wherein the protein capture agents are affixed to a plurality of positions on a solid substrate; (3) arrays wherein the protein capture agents are distributed as discrete spots within a gelatinous or porous substrate. In each case, the array is organized such that the protein(s) expected to bind to each position on the array are known. The smaller each position of the array, the greater the number of protein capture agents that can be included within an area. Miniaturization is beneficial because it reduces the sample size required to obtain a readable signal, reduces the amount of each protein capture agent needed, and permits smaller instruments for the production and analysis of the arrays. In an example of array type (2), a silicon wafer is coated with a grid of gold and titanium. An amino-reactive compound (eg. 11,11′-dithiobis(succinimidylundecanoate)) is applied to the gold surfaces and then used to immobilize antibodies spotted onto the array.

The procedure to analyze a complex mixture of proteins is, in general, as follows. A mixture of proteins is applied to the protein identification array. If a protein of the mixture can be bound by a protein capture agent of the array, the protein will localize to that particular position on the array. The array is designed such that it is known which proteins will bind to which positions on the array. Therefore, much as with nucleic acid identification arrays, each protein can be identified by the position on the array that it binds to. Proteins on the array can be measured by a variety of methods. Generally, the proteins will be labeled prior to application to the array. Labels may include any of those discussed herein. The amount of protein present at each position of the array may be measured by measuring the presence of the label.

Protein identification arrays may be comprehensive, encompassing as many proteins and protein variants as possible, or the array may be selective, representing only a subset of proteins or protein types.

Edman Degradation

Protein identification may be accomplished by any of a variety of sequencing methods. For example, the most commonly used sequencing methods include amino-terminal sequencing using the Edman degradation method and mass spectrometry (see above). In general, Edman degradation is useful for obtaining the amino-terminal sequence of a purified polypeptide. Internal sequence of the polypeptide may be obtained by fragmenting the polypeptide (eg. through proteolysis), thereby generating internal fragments with free amino-termini. Edman degradation is most effective within the 15-30 amino acids most proximal to the amino terminus. With the availability of extensive databases of nucleic acid and protein sequences, it is usually unnecessary to obtain a complete protein sequence in order to make an unambiguous identification. One or more fragmentary sequences may be compared against sequence databases to identify matches. Typically 15-20 amino acids will be sufficient to make an unambiguous identification, particularly when combined with information such as predicted molecular weight and species of origin.
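
As a minimal sketch only, the comparison of fragmentary sequences against a sequence database may be illustrated in Python by exact substring matching. The database entries, the fragment sequences and the function name are hypothetical; practical database searches generally tolerate sequencing errors and use alignment-based tools rather than exact matching.

```python
def find_matches(fragments, database):
    """Names of database entries whose sequence contains every fragment."""
    return [
        name for name, sequence in database.items()
        if all(fragment in sequence for fragment in fragments)
    ]

# Hypothetical database of two closely related sequences.
database = {
    "entry_1": "MKTAYIAKQRQISFVKSHFSRQ",
    "entry_2": "MKTAYIAKQRGGSFVKSHFSRQ",
}
n_terminal = "MKTAYIAKQR"   # hypothetical amino-terminal sequence
internal = "QISFVK"         # hypothetical internal fragment sequence

ambiguous = find_matches([n_terminal], database)
unambiguous = find_matches([n_terminal, internal], database)
```

In this example the amino-terminal sequence alone matches both entries, while the addition of one internal fragment narrows the result to a single entry.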

In an exemplary embodiment, a protein is attached to a solid support such as a chemically modified glass disk or a porous polyvinylidene fluoride membrane in the reaction cartridge. It is then coupled to phenylisothiocyanate (PITC) at pH 8 and 45° C. The free N-terminal amino group reacts with the carbon of the isothiocyanate group to give the phenylthiocarbamyl (PTC) derivative of the peptide. The next step is cleavage of the PTC derivative using anhydrous trifluoroacetic acid to give the anilinothiazolinone (ATZ) derivative of the N-terminal amino acid, and the peptide with one fewer amino acid, which is free to undergo further couplings and cleavages. The ATZ residue is then filtered into the conversion flask, where it is converted to the phenylthiohydantoin (PTH) amino acid. This is a two step process. First, the ATZ derivative is hydrolyzed under aqueous, acidic conditions to give the PTC amino acid. The acid then cyclizes to give the stable PTH derivative. Each derivative is then injected into a high pressure liquid chromatography (HPLC) column, where its retention time is compared with that of known PTH amino acid standards. The reaction is then repeated with the remaining C-terminal portion of the original peptide. Thus, each round of the Edman reaction identifies one further amino acid residue in a protein.

X-Ray Diffraction Crystallography

In an embodiment, a protein sequence and structure may be studied using X-ray diffraction crystallography. In this method, a crystal of the protein is prepared. Methods of solubilizing and growing crystals of membrane proteins are described, for example, in U.S. Pat. No. 6,172,262 to McQuade et al, and in U.S. Pat. No. 6,174,365 to Sanjoh. X-rays are directed onto the crystal to produce diffracted beams, which are subsequently detected by film or various electronic detectors. The pattern of diffraction is determined in part by the atomic structures on which the incident X-rays impinge and from which they diffract. In a crystal, these atomic structures are regularly ordered, so that the diffracted X-rays form regular patterns of interference. A particular diffraction pattern may therefore be associated with a particular arrangement of atoms. Thus, the appearance of a given diffraction pattern may suggest to one of ordinary skill in the art that the crystal being studied comprises the corresponding atomic structure.

Each atom in a crystal scatters X-rays in all directions, and only those that constructively interfere with one another, according to Bragg's law, give rise to diffracted beams that can be recorded as a distinct diffraction spot above background. Each diffraction spot is the result of interference of all X-rays with the same diffraction angle emerging from all atoms. For example, for the protein crystal of myoglobin, each of the about 20,000 diffracted beams that have been measured contains scattered X-rays from each of the around 1500 atoms in the molecule.
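
Bragg's law relates the spacing d between lattice planes, the X-ray wavelength λ and the diffraction angle θ by nλ = 2d·sin θ, where n is the diffraction order. A minimal Python sketch of this relation follows; the function name and the example values are hypothetical and are given for illustration only.

```python
from math import asin, degrees

def bragg_angle(d_spacing, wavelength, order=1):
    """Diffraction angle theta in degrees satisfying n*lambda = 2*d*sin(theta).

    d_spacing and wavelength must be in the same unit (for example,
    angstroms). Returns None when no angle satisfies the condition.
    """
    sine = order * wavelength / (2.0 * d_spacing)
    if sine > 1.0:
        return None  # the spacing is too small to diffract at this wavelength
    return degrees(asin(sine))

# Hypothetical example: 1.0 angstrom radiation and 1.0 angstrom plane spacing.
theta = bragg_angle(1.0, 1.0)
unreachable = bragg_angle(0.4, 1.0)
```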

Crystal structures have traditionally been more difficult to obtain for integral membrane proteins, but recent developments have made this increasingly possible and rapid. (Abramson et al. (1999). “Crystallization of membrane proteins” in “Crystallization of proteins: techniques, strategies and tips, a laboratory manual”. (Edited by Bergfors, T.), International University Line, La Jolla, Calif. 199-210; Abramson, et al. (2000) Nat. Str. Biol. 7 (10); Byrne et al. (2000) Biochim. Biophys. Acta. 1459, 449-455; Iwata et al. (1995) Nature 376, 660-669; Iwata et al. (1998) Science 281, 64-71; Michel et al. (1982) J. Mol. Biol. 158, 567-572; Ostermeier, et al. (1995) Nature Str. Biol. 2, 842-846; Landau et al. (1996) Proc. Natl. Acad. Sci. USA. 93, 14532-14535).

Nuclear Magnetic Resonance

In an embodiment, NMR may be used to analyze the structure of membrane proteins. Briefly, the technique involves placing the material to be examined (usually in a suitable solvent) in a powerful magnetic field and irradiating it with radio frequency (rf) electromagnetic radiation. The nuclei of the various atoms will align themselves with the magnetic field until energized by the rf radiation. They then absorb this resonant energy and re-radiate it at a frequency dependent on i) the type of nucleus and ii) its atomic environment. Moreover, resonant energy can be passed from one nucleus to another, either through bonds or through three-dimensional space, thus giving information about the environment of a particular nucleus and nuclei in its vicinity.

Certain atoms are particularly well suited to analysis using NMR. For example, most early NMR work detected resonance energy from ¹H atoms. Over the past few years, labeling proteins with ¹⁵N and ¹⁵N/¹³C has raised the analytical molecular size limit to approximately 15 kiloDaltons (kD) and 40 kD, respectively. More recently, partial deuteration of the protein in addition to ¹³C- and ¹⁵N-labeling has increased the size of proteins and protein complexes that can be analyzed still further, to approximately 60-70 kD. See Shan et al., J. Am. Chem. Soc., 118: 6570-6579 (1996) and references cited therein.

Membrane Topology

The methods described herein may be used for determination of membrane topology. Labeling agent will bind only to those portions of a protein that are exposed to the environment external to the membrane structure. Accordingly, the position of labeling agent on each protein creates a record of which portions of the protein are exposed on the external face of the membrane. The position of label on each protein may be determined by, for example, mass spectrometry analysis of digested, labeled proteins. Each fragment of a protein may be identified as labeled or unlabeled and assigned as an external or internal fragment, respectively. To be so assigned, a fragment should have at least one amino acid that can react with the labeling reagent. Any fragment unable to react with labeling agent will of course not be labeled, and therefore cannot be assigned topologically. It is anticipated that this methodology would permit high-throughput determination of membrane topology by rapid analysis of fragmented, labeled proteins.
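The assignment rule described above may be sketched as follows. This is a minimal illustration that assumes an amine-reactive labeling agent whose reactive residue is lysine (one-letter code K); the class name, function name and example sequences are hypothetical, and reaction at a free N-terminal amine is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    sequence: str   # one-letter amino acid codes of the digested fragment
    labeled: bool   # whether label was detected on this fragment, e.g. by MS


def assign_topology(fragment: Fragment, reactive_residues: str = "K") -> str:
    """Assign a fragment as external, internal, or unassigned.

    Labeled fragments are external. Unlabeled fragments are internal only
    if they contain a residue that could have reacted; otherwise the
    absence of label is uninformative and the fragment is unassigned.
    """
    if fragment.labeled:
        return "external"
    if any(residue in fragment.sequence for residue in reactive_residues):
        return "internal"
    return "unassigned"


if __name__ == "__main__":
    # Hypothetical fragments for illustration only.
    for frag in (Fragment("AKLG", True), Fragment("TKVS", False), Fragment("ALLG", False)):
        print(frag.sequence, assign_topology(frag))
```

The `reactive_residues` parameter may be changed to reflect the chemistry of whichever labeling agent is actually used.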

Delivery Systems for High-Throughput Identification

Each of the above-described methods for identifying the sequence and/or structure of membrane proteins may be employed in a system designed for high-throughput identification. Techniques such as liquid chromatography, gas chromatography, gel permeation chromatography, size exclusion chromatography, solid phase extraction, capillary electrophoresis, and capillary electrochromatography are all well-known methods for preparing and delivering analytical samples to XRC, NMR, MS, and Edman degradation devices.

5. Diagnostic Assays and Cell Surface Markers

In certain aspects, the invention provides methods for comparing the biological states of cells by comparing the membrane surface protein profiles from different cell samples. In general, a comparative method may comprise treating a first sample with a labeling agent and treating a second sample with a labeling agent. Each of the labeled samples is then processed to produce a preparation of labeled cell surface proteins. A plurality of cell surface proteins from each preparation are analyzed to identify the proteins, and, preferably, to obtain quantitative and/or qualitative information about each analyzed protein. The information obtained about each surface protein preparation forms a profile, and the profiles from different samples may be compared to identify differences and similarities between the samples.

In certain embodiments, profiles may be treated as fingerprints that are indicative, as a whole, of a particular sample type and its associated biological state. As an illustrative example, surface protein profiles from healthy tissues and cancerous tissues may be obtained and recorded. A sample of unknown health status may then be used to prepare a surface protein profile, and this profile is compared against previously obtained profiles to determine whether the sample more closely matches healthy or cancerous tissue. In preferred embodiments, statistical methods are used to identify characteristics of surface protein profiles that are most indicative of particular biological states. For example, a subset of surface proteins may be particularly associated with a cancerous state. In this manner, methods of the invention may be used to identify cell surface markers that are diagnostic of particular biological states.
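One way such a fingerprint comparison might be carried out is sketched below. The sketch assumes that each profile is a mapping from protein identifier to a measured abundance and uses Pearson correlation as the similarity measure; the specification does not prescribe a particular statistic, and the protein identifiers and values shown are hypothetical.

```python
import math


def pearson(a: list[float], b: list[float]) -> float:
    """Pearson correlation of two equal-length abundance vectors."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / math.sqrt(var_a * var_b)


def closest_state(unknown: dict[str, float],
                  references: dict[str, dict[str, float]]) -> tuple[str, dict[str, float]]:
    """Return the reference state whose profile best correlates with the unknown."""
    proteins = sorted(unknown)
    vec = [unknown[p] for p in proteins]
    scores = {
        state: pearson(vec, [ref.get(p, 0.0) for p in proteins])
        for state, ref in references.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical abundances for three hypothetical surface proteins.
    references = {
        "healthy": {"P1": 10.0, "P2": 1.0, "P3": 5.0},
        "cancerous": {"P1": 1.0, "P2": 10.0, "P3": 5.0},
    }
    state, scores = closest_state({"P1": 2.0, "P2": 9.0, "P3": 5.0}, references)
    print(state, scores)
```

With larger collections of reference profiles, the same comparison could be replaced by any suitable classifier; correlation is used here only because it is simple to inspect.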

It is expected that essentially any two cells with different biological properties will evince differences in cell surface protein composition. Accordingly, methods of the invention will be useful in profiling and/or identifying cell surface markers for essentially any biological property of interest.

Exemplary biological states are presented herein solely for the purposes of illustration.

Cancer

Cancers, or neoplasms, develop through a series of stages including the initial formation of a modified tumor cell, formation of a localized tumor mass, development of invasive properties, and metastasis to distal sites. While the progression and genetic abnormalities of each tumor are distinct, the progression of a tumor inevitably involves changes in gene expression that result in differences in the complement of cell surface proteins. In addition, cancers that are classified within the same group often arise from distinct cell types that require different treatment protocols. Accordingly, the rapid identification of differences in cell surface proteins will be useful for tumor identification, staging and treatment selection, as well as basic research into the mechanisms of tumor progression.

For example, diagnosis and treatment of leukemias could be substantially improved with the identification of additional cell surface markers. Acute leukemias are currently classified into those arising from lymphoid precursors (acute lymphoblastic leukemia, ALL) and those arising from myeloid precursors (acute myeloid leukemia, AML). This classification is made primarily on the basis of lymphoid- or myeloid-specific cell surface markers, in combination with nuclear morphology, periodic acid-Schiff base staining, and detection of myeloperoxidase. Although the distinction between AML and ALL is well established, no single test is currently sufficient to establish the diagnosis. The selection of an appropriate treatment protocol depends upon the correct identification of ALL or AML. Chemotherapy for ALL generally involves corticosteroids, vincristine, methotrexate, and L-asparaginase, whereas most AML regimens rely on daunorubicin and cytarabine (Pui et al. (1998) N. Engl. J. Med. 339: 605).

Several cell surface proteins are known to be useful in distinguishing ALL from AML, including CD11c, CD33, and MB-1. Recent transcriptome analysis demonstrated that an additional membrane protein, leptin receptor, is also differentially expressed (high expression in AML) (Golub et al. (1999) Science 286 (5439): 531-537). In addition, the leptin receptor may have a functional role in inhibiting apoptosis of neoplastic cells, and thus represents a target for therapeutic intervention. The identification of further distinctive membrane proteins would clearly have benefits both for diagnostics and treatment, and provide an advantage over the use of transcriptome analysis because the direct analysis of proteins takes into account any post-transcriptional regulation (Konopleva, et al. (1999) Blood 93: 1668).

In another example, a variety of secreted and cell surface proteins are used in the identification of prostate cancers. The most commonly utilized tests for prostate cancer are digital rectal examination and analysis of serum prostate specific antigen (PSA). Although PSA has been widely used as a clinical marker of prostate cancer since 1988, screening programs utilizing PSA alone or in combination with digital rectal examination have not been successful in improving the survival rate for men with prostate cancer. While PSA is specific to prostate tissue, it is produced by normal and benign as well as malignant prostatic epithelium, resulting in a high false-positive rate for prostate cancer detection. Other markers that have been used for prostate cancer detection include prostatic acid phosphatase (PAP) and prostate secreted protein (PSP). PAP is secreted by prostate cells under hormonal control. It has less specificity and sensitivity than does PSA. As a result, it is used much less now, although PAP may still have some applications for monitoring metastatic patients that have failed primary treatments. In general, PSP is a more sensitive biomarker than PAP, but is not as sensitive as PSA. Like PSA, PSP levels are frequently elevated in patients with BPH as well as those with prostate cancer. Another serum marker associated with prostate disease is prostate specific membrane antigen (PSMA). PSMA is a Type II cell membrane protein and has been identified as Folic Acid Hydrolase (FAH). Antibodies against PSMA react with both normal prostate tissue and prostate cancer tissue (Horoszewicz et al., 1987). However, PSMA may have utility in certain circumstances. PSMA is expressed in metastatic prostate tumor capillary beds (Silver et al., 1997) and is reported to be more abundant in the blood of metastatic cancer patients (Murphy et al., 1996). Recently, prostate stem cell antigen (PSCA) was identified as a cell surface protein that is overexpressed in prostate cancer cells. This marker has proven useful in diagnosing prostate cancer, and monoclonal antibodies targeting PSCA have shown some promise in treating prostate cancer in animal models.

While many cell surface and secreted molecules related to prostate cancer have been identified, clear and reliable diagnostic markers have not yet been identified. The rapid and large-scale identification of cell surface proteins from normal and cancerous prostate tissue holds great promise for the development of improved prostate cancer diagnostics and therapeutics.

Viral Infections

In general, viruses may exist in several different states within the host. The lysogenic lifecycle typically involves semi-stable incorporation of the viral genome into the host cell accompanied by an absence or relatively low level of viral reproduction. The lytic lifecycle usually involves rapid replication of the viral genome, production of viral particles, viral maturation and host cell death. Viral infection results in a change in the host cell protein production and these differences are reflected in the profile of cell surface proteins. Proteins differentially present at the cell surface may be useful as targets for antiviral therapy and may also be used in diagnosing and staging viral infections.

For example, cytomegalovirus encodes two proteins, US2 and US11, that target MHC class I and class II molecules for degradation, substantially decreasing the amount of these critical immune recognition proteins present on the membranes of infected cells (Shamu et al. (1999) J Cell Biol. 147(1): 45-58; Tomazin et al. (1999) Nat Med 5(9): 1039-43). Similarly, the Vpu protein of HIV targets the host CD4 protein for destruction through a ubiquitin and proteasome-dependent pathway (Schubert et al. (1998) J. Virol. 72(3): 2280-8). Thus, many viruses alter the complement of proteins present on the surface of the host cell.

In addition, viral maturation recruits a number of viral proteins to the cell membrane for assembly into the newly forming virion. It is known that this process involves a number of host proteins, including the clathrin-mediated vesicle transport system and the ubiquitination system. We predict that a number of host proteins will re-localize to the cell surface during viral maturation. Such proteins may be functionally important in viral maturation and may therefore be suitable targets for antiviral therapy. Accordingly, the characterization of cell surface protein profiles from cells at various stages of viral infection will be a powerful method for identifying proteins useful in treatment and diagnosis of viral diseases.

Other infective states, such as infection by intracellular bacterial pathogens and eukaryotic parasites, are also anticipated to cause informative changes in cell surface protein composition.

Cell Surface Markers

Cell surface markers provided by the invention may be used in a variety of methods for the separation or characterization of cell populations. In one embodiment of the invention, sample cells can be detected and quantified using a flow cytometer. Fluorescence activated cell sorting (FACS) flow cytometry is a common technique for antibody based cell detection and separation. Typically, detection and separation by flow cytometry is performed as follows. A sample containing the cells of interest is contacted with fluorochrome-conjugated antibodies, which allows for the binding of the antibodies to one or more specific cell markers. The bound cells are washed, typically by one or more centrifugation and resuspension steps. The cells are then run through a FACS device which separates the cells based on, among other characteristics, the different fluorescence properties imparted by the cell-bound fluorochrome. FACS systems are available in varying levels of performance and ability, including multicolor analysis, which is preferred in the present invention. For use of multiple cell surface markers, it is preferable to use fluorochromes with distinguishable fluorescence properties. For a general review of flow cytometry, see Parks et al., 1986, Chapter 29: Flow Cytometry and fluorescence activated cell sorting (FACS) in: Handbook of Experimental Immunology, Volume 1: Immunochemistry, Weir et al. (eds.), Blackwell Scientific Publications, Boston, Mass.

Cell surface markers may also be used in other cell detection and separation techniques. One such method is biotin-avidin based separation by affinity chromatography. Typically, such a technique is performed by incubating the sample of cells with biotin-conjugated antibodies to cell surface markers of interest, followed by contact with an avidin-coated substrate such as a column. Biotin-antibody-cell complexes bind to the column via the biotin-avidin interaction, while other cells pass through the column. The specificity of the biotin-avidin system is well suited for rapid positive detection and separation. Once isolated, the cells can be quantified and characterized as desired. Yet another method is magnetic separation using antibody-coated magnetic beads. Kemmner et al., 1992, J. Immunol. Methods 147: 197-200; Racila et al., 1998, Proc. Natl. Acad. Sci. USA 95: 4589-4594. Another exemplary cell separation method involves the use of antibodies and protein A-coated substrates. In addition, in situ microscopy methods may be used to identify cells with the markers of interest on their surfaces.

7. Computer and Database Systems

In certain aspects the invention provides computer systems and computer-assisted methods for analyzing membrane surface proteins. A computer system of the invention may comprise a database system comprising a plurality of records reflecting membrane surface protein profiles for different samples and a user interface allowing a user to selectively view information from each profile. In preferred embodiments, the database system will comprise, in addition to membrane surface protein profiles, linked entries reflecting the nature of the sample or cells from which each profile was obtained. For example, such an entry may contain clinical information such as patient history, clinical diagnosis, clinical test results, prognosis, treatment regimen and outcome. Such an entry may also include information regarding genotype of the subject or cells from which the sample was obtained. For example, cancers typically contain a number of chromosomal abnormalities and these may be reflected in a linked database entry. With respect to viral infections, linked entries may indicate the type of viral infection. Other types of information that may be entered as linked entries include, but are not limited to, levels of various transcripts and levels of intracellular proteins.
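A database of this kind might be organized, for example, as one table of sample records carrying the linked clinical and genotype entries and a second table holding the profile measurements for each sample. The following minimal sketch uses the SQLite engine from the Python standard library; the table names, column names and example values are hypothetical and are not taken from any particular implementation.

```python
import sqlite3


def build_database() -> sqlite3.Connection:
    """Create an in-memory database with sample records and linked profiles."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE samples (
            sample_id      INTEGER PRIMARY KEY,
            description    TEXT,   -- nature of the sample or cells
            diagnosis      TEXT,   -- linked clinical entry
            genotype_notes TEXT    -- linked genotype entry
        );
        CREATE TABLE profile_entries (
            sample_id INTEGER REFERENCES samples(sample_id),
            protein   TEXT,
            abundance REAL
        );
        """
    )
    return conn


def add_sample(conn, description, diagnosis, genotype_notes, profile):
    """Insert a sample record and its surface protein profile; return its id."""
    cur = conn.execute(
        "INSERT INTO samples (description, diagnosis, genotype_notes) VALUES (?, ?, ?)",
        (description, diagnosis, genotype_notes),
    )
    sample_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO profile_entries (sample_id, protein, abundance) VALUES (?, ?, ?)",
        [(sample_id, protein, abundance) for protein, abundance in profile.items()],
    )
    return sample_id


def profile_for(conn, sample_id):
    """Return the profile of one sample as a protein-to-abundance mapping."""
    rows = conn.execute(
        "SELECT protein, abundance FROM profile_entries WHERE sample_id = ? ORDER BY protein",
        (sample_id,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_database()
    sid = add_sample(conn, "cultured cells, labeled", "hypothetical diagnosis",
                     "none recorded", {"protein_A": 1.5, "protein_B": 0.2})
    print(profile_for(conn, sid))
```

Further linked entries, such as transcript levels or treatment outcome, could be added as additional columns or tables keyed on the same sample identifier.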

A variety of software packages are available for data collection and analysis. Preferred data analysis systems are able to scan 2D gels and assign different colors to different fluorophores present in the gels. This permits direct comparison of differentially labeled proteins resolved on the same gel. For example, Z3 software for the analysis of 2D gels is available from Compugen Inc.

8. Membrane Surface Markers and Screening Assays for Novel Therapeutics

In yet other aspects, the invention provides methods for identifying membrane surface protein markers. Such methods may include obtaining a cell surface protein profile from a cell type of interest, and comparing that profile to profiles from other cell types to identify distinguishing markers for the cell type of interest. For example, such methodology may be used to identify stem cell-specific cell surface markers. Such markers may then be used to enrich for cells of interest. In a further illustrative example, a marker for infection with a particular virus may be identified and used to identify subjects having infected cells. Marker proteins may be used to separate cells by Fluorescence Activated Cell Sorting (FACS) or other marker-based separation methods.

Markers and/or profiles may be used to screen for therapeutics. Cell surface proteins associated with a disease state may be diminished or eliminated by treatment with certain test compounds. Such test compounds may be useful as therapeutics for the disease state. In addition, certain test compounds may increase the presence of cell surface proteins that are normally present on healthy cells but diminished or absent in diseased cells. Such test compounds may also be useful as therapeutics. Particularly preferred therapeutics will cause the cell surface protein profile of a diseased cell to more closely resemble the cell surface protein profile of a healthy cell.
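The notion of a treated profile "more closely resembling" a healthy profile may be made concrete with a distance measure. The following minimal sketch ranks test compounds by how much each reduces the Euclidean distance between the diseased profile and the healthy profile; the choice of Euclidean distance, the function names, and all compound names and values are assumptions made solely for illustration.

```python
import math


def profile_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Euclidean distance between two profiles; absent proteins count as 0."""
    proteins = set(a) | set(b)
    return math.sqrt(sum((a.get(p, 0.0) - b.get(p, 0.0)) ** 2 for p in proteins))


def rank_compounds(healthy: dict[str, float],
                   diseased: dict[str, float],
                   treated: dict[str, dict[str, float]]) -> list[tuple[str, float]]:
    """Rank compounds by reduction in distance to the healthy profile.

    A positive score means the treated profile is closer to healthy than
    the untreated diseased profile was.
    """
    baseline = profile_distance(diseased, healthy)
    improvements = {
        name: baseline - profile_distance(profile, healthy)
        for name, profile in treated.items()
    }
    return sorted(improvements.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical profiles for two hypothetical surface proteins.
    healthy = {"A": 1.0, "B": 5.0}
    diseased = {"A": 9.0, "B": 1.0}
    treated = {"compound_1": {"A": 2.0, "B": 4.0}, "compound_2": {"A": 9.0, "B": 1.0}}
    for name, score in rank_compounds(healthy, diseased, treated):
        print(name, round(score, 3))
```

Measurements on different scales would ordinarily be normalized before such a comparison; that step is omitted here for brevity.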

In further embodiments, the differences between healthy and unhealthy tissue samples may be analyzed to identify targets for therapeutic screening, and a screen may be designed to identify compounds that bind or otherwise affect the activity of the given target. For example, as noted above, leptin receptor is selectively overexpressed in certain leukemias. If, in fact, this overexpression leads to an increase in the level of leptin receptor present at the cell surface, therapeutics that disrupt the leptin receptor signaling pathway may be useful in treating leukemias.

In certain embodiments, a method for selecting an appropriate therapeutic for a subject is a computer-assisted method. Such a method may comprise obtaining a cell surface protein profile or measuring a marker protein in a sample from a subject. The output signal may then be compared against a database comprising output signal information from a plurality of subjects and further comprising clinical status information from a plurality of subjects. It is contemplated that one may use a computer interface to identify in the database any clinical conditions correlated with the protein profile or marker. Accordingly, one may select a targeted therapeutic to ameliorate or prevent the correlated condition.

EXAMPLES

The invention now being generally described, it will be more readily understood by reference to the following examples, which are included merely for purposes of illustration of certain aspects and embodiments of the present invention, and are not intended to limit the invention.

Example 1 Tagging of Cell Surface Proteins in Live Cells with EZ-LINK NHS-SS-Biotin

One set of HeLa cells is labeled with cleavable biotin, and a second with DMSO as control.

1. Wash cells three times with cold PBS.
2. Detach cells from 4 roller bottles with 50 ml PBS/5 mM EDTA (prepared at room temperature) for 15 minutes in the incubator, while rolling. Place in 50 ml tubes and pellet cells at 1800 rpm, 4° C. for 5 minutes. Count cells.
3. Resuspend cells from all tubes in 50 ml PBS/CM and spin down at 1800 rpm, 4° C. for 10 minutes.
4. Resuspend the cells at 25×10⁶ cells/ml in PBS/CM containing 0.5 mg/ml sulfo-biotin-NHS. Place cells in a 5 ml snap cap tube and cover with aluminum foil.
5. Incubate with gentle shaking, in the cold cabinet, for 20 minutes. Spin down cells, 1500 rpm, 4° C. for 5 minutes. Resuspend at 25×10⁶ cells/ml in PBS/CM containing 0.5 mg/ml sulfo-biotin-NHS. Incubate as before for 20 more minutes.
6. Transfer cell suspension to a 15 ml tube. Pellet cells as in step 5. Quench reaction by gently resuspending cells in 5 ml of 50 mM glycine in PBS/CM. Incubate with gentle shaking for 10 minutes at 4° C.
7. Wash cells three times in PBS/CM, by centrifugation as in step 5.
8. Resuspend cells in 2 ml solubilization buffer containing 50 mM Tris-HCl, pH 7.6, 150 mM NaCl, 10% glycerol, 2% ASB-14, 5 mM EDTA, 1 mM EGTA, 1.5 mM MgCl₂, protease inhibitors. Cell lysis for membrane proteins is usually done with a buffer containing 0.5% Triton X-100. However, it was determined in our laboratory that ASB-14 is a better solubilizing agent.
9. Incubate on ice for 30 minutes. Spin for 20 minutes at 14,000 rpm, 4° C.
10. Transfer supernatant to fresh tubes.
11. To each of the tubes containing the supernatant add streptavidin agarose beads (Pierce) (80 μl beads to 1 ml solubilized extract). Incubate in the thermomixer, 1400 rpm, 1 hour, 4° C.
12. Spin down beads, 1 minute, 14,000 rpm, 4° C. Aspirate off the supernatant.
13. Wash beads twice for 5 minutes, with gentle agitation, with 1 ml wash buffer 1, containing 20 mM Tris-HCl, pH 7.6, 300 mM NaCl, 10% glycerol, 0.1% Triton X-100, 0.1% SDS, 1 mM EDTA, 1 mM EGTA, 1.5 mM MgCl₂, protease inhibitors.
14. Wash once with 1 ml wash buffer 2 containing 20 mM Tris-HCl, pH 7.5, 10% glycerol, 0.1% Triton X-100, 1 mM EDTA, 1 mM EGTA, 1.5 mM MgCl₂, protease inhibitors and phosphatase inhibitors. Spin for 2 minutes at 14,000 rpm, 4° C. Discard the supernatant.
15. To the final bead pellet add two bead volumes of reducing solution containing 20 mM Tris-HCl, pH 7.6, 50 mM TCEP-HCl, 150 mM NaCl, protease inhibitors and incubate at room temperature for 2 hours.
16. At the end of the incubation spin at 14,000 rpm for 2 minutes. The supernatant contains the cell surface biotinylated proteins ready for analysis. The sample can be analyzed by SDS-PAGE as well as 2D electrophoresis. Since the final products contain almost exclusively integral cell surface proteins, it is possible to analyze proteins by one dimension only. Furthermore, the resulting proteins can be subjected to any form of separation, such as HPLC or FPLC, which can be directly linked to mass spectrometry analysis.
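The quantities called for in this protocol scale with the number of cells recovered and the volume of solubilized extract. The following minimal sketch shows the arithmetic using the values stated above (25×10⁶ cells/ml, 0.5 mg/ml labeling reagent, 80 μl beads per 1 ml extract); the function names are hypothetical and the cell count in the usage example is illustrative only.

```python
def labeling_plan(total_cells: float,
                  cells_per_ml: float = 25e6,
                  reagent_mg_per_ml: float = 0.5) -> tuple[float, float]:
    """Return (resuspension volume in ml, reagent mass in mg) for one labeling round."""
    volume_ml = total_cells / cells_per_ml
    reagent_mg = volume_ml * reagent_mg_per_ml
    return volume_ml, reagent_mg


def bead_volume_ul(extract_ml: float, beads_ul_per_ml: float = 80.0) -> float:
    """Return the streptavidin agarose bead volume (in microliters) for an extract."""
    return extract_ml * beads_ul_per_ml


if __name__ == "__main__":
    # Illustrative count of 1e8 cells; the 2 ml extract volume is from step 8.
    volume_ml, reagent_mg = labeling_plan(100e6)
    print(volume_ml, reagent_mg, bead_volume_ul(2.0))
```

Because the protocol calls for two successive 20 minute incubations in fresh labeling solution, the reagent mass returned for one round would be needed twice.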

Example 2 Tagging of Cell Surface Proteins in Live Cells with Alexa Fluor®488-NHS

Several Alexa Fluor products are available, each with a different emission maximum. Thus, cells can be treated differently and labeled with differently emitting Alexa Fluor reagents. The detection can be done with two lasers, one that will detect one fluorophore and the other the second. An image can then be generated, and the proteins that are found in both samples will give a different color than each fluorophore alone.

One set of HeLa cells is labeled with Alexa Fluor®488-NHS, and a second with DMSO as control. All steps are performed on ice to prevent internalization of cell surface proteins.

1. Wash cells three times with cold PBS.
2. Detach cells from 4 roller bottles with 50 ml PBS/5 mM EDTA (prepared at room temperature) for 15 minutes in the incubator, while rolling. Place in 50 ml tubes and pellet cells at 1800 rpm, 4° C. for 5 minutes. Count cells.
3. Resuspend cells from all tubes in 50 ml PBS/CM and spin down at 1800 rpm, 4° C. for 10 minutes.
4. Resuspend the cells at 25×10⁶ cells/ml in PBS/CM containing 0.5 mg/ml Alexa Fluor®488-NHS. Place cells in a 5 ml snap cap tube and cover with aluminum foil.
5. Incubate with gentle shaking, in the cold cabinet, for 20 minutes. Spin down cells, 1500 rpm, 4° C. for 5 minutes. Resuspend at 25×10⁶ cells/ml in PBS/CM containing 0.5 mg/ml Alexa Fluor®488-NHS. Incubate as before for 20 more minutes.
6. Transfer cell suspension to a 15 ml tube. Pellet cells as in step 5. Quench reaction by gently resuspending cells in 5 ml of 50 mM glycine in PBS/CM. Incubate with gentle shaking for 10 minutes at 4° C.
7. Wash cells three times in PBS/CM, by centrifugation as in step 5.
8. Aspirate supernatant gently with a Pasteur pipette hooked to the vacuum pump. Measure the volume of the cell pellet by comparing to an equivalent tube containing a known volume of water measured by a pipetman. Resuspend in 3× cell volume of ice cold lysis buffer (50 mM Tris-HCl, pH 7.6, 15 mM KCl, 2 mM MgCl₂, 2 mM DTT, 1× protease inhibitor cocktail) and incubate on ice for 30 minutes.
9. Subject the cells to two rounds of freeze-thaw cycles (liquid nitrogen, then 37° C. water bath) to break the cell membrane.
10. Remove the unbroken cells and nuclei by centrifugation, 3000×g for 10 minutes at 4° C. Transfer supernatant to a fresh eppendorf tube.
11. Spin at 10,000×g for 30 minutes in an eppendorf centrifuge. Remove supernatant (cytosol); the pellet is the membrane fraction.
12. Wash the membrane to get rid of peripheral proteins. Set the thermomixer at 700 rpm and 4° C. Resuspend the membrane pellet with 45 μl lysis buffer and add 450 μl ice cold 0.1 M sodium carbonate (stored at 4° C.) containing 1× protease inhibitors. Place tubes in the thermomixer and mix for 1 hour.
13. Transfer protein mixture from step 12 to ultracentrifuge 120.1 tubes. Spin at 55,000 rpm, 4° C. for 20 minutes.
14. Remove supernatant (peripheral proteins; make sure that you remove most of the supernatant). Resuspend the membrane pellet in 400 μl ice cold 50 mM Tris-HCl, pH 7.6 containing 1× protease inhibitors. Spin at 55,000 rpm, 4° C. for 20 minutes. The supernatant contains cell surface fluorescent proteins ready for analysis. The sample can be analyzed by SDS-PAGE as well as 2D electrophoresis without the need for staining. The resulting proteins can be subjected to any form of separation, such as HPLC or FPLC, which can be directly linked to mass spectrometry analysis.

Example 3 Cell Surface Protein Profiling Methodologies

The flow chart in FIG. 1 exemplifies several possible combinations of cell surface protein labeling and identification techniques. A summary of certain aspects of the illustrated methods is set forth below.

These exemplary methods begin with a selective labeling of cell surface proteins. The labeling method, when performed using a labeling agent that binds to lysine residues, acidifies proteins, making isoelectric focusing (and thereby 2D gel electrophoresis) possible for highly basic proteins. The labeled proteins are ultimately identified by mass spectrometry analysis. Resolution of proteins for mass spectrometry may be accomplished by chromatographic separations, 2D gel electrophoresis or 1D gel electrophoresis. 2D gel electrophoresis may also be used as part of a differential display method for identifying those proteins whose expression levels change in different conditions.

MS analysis provides a wealth of information including protein sequence. This information can be fed into database records and used for generating and analysing cell surface protein profiles obtained from a variety of sources.

Example 4 Multiple Labeling Methods for Profiling Membrane Surface Proteins

The flow chart in FIG. 2 exemplifies several possible combinations of cell surface protein labeling and identification techniques. A summary of certain aspects of the illustrated methods is set forth below.

The starting material may be either intact cells or other closed membrane structures obtained from cells, such as organelles or vesicles. Such subcellular structures may be obtained by fractionation, for example by sucrose density gradient centrifugation.

The starting material is treated with a labeling agent that reacts with amines and has a disulfide bond. In one variation, the labeling agent has a marking moiety that is biotin, which is connected to the protein binding moiety through a disulfide bond. The biotinylated cell surface proteins may be enriched by passage over a streptavidin column. Whether enriched or not, the labeled proteins can then be subjected to reducing conditions that break the disulfide bond. This process results in labeled proteins having a free thiol at positions formerly having an amine. Because amines are generally more abundant than thiols in proteins, this method makes it possible to achieve much more efficient labeling with thiol-reactive agents. This method is particularly effective with basic proteins because these proteins tend to have many amines available for modification, and the modification process neutralizes these amines, rendering the proteins more tractable to analysis by isoelectric focusing. For low abundance proteins (which many membrane proteins are), thiol-reactive labeling agents often give insufficient signal because of the low number of thiols per protein. This method greatly improves the density of label and detectability of such low abundance proteins.

The modified proteins are then reacted with a second labeling agent that is reactive with thiols. The labeling agent may be fluorescent or radioactive (including ICAT reagents). These labeled proteins are then analyzed by chromatography or gel electrophoresis and ultimately identified by mass spectrometry. This data may then be fed into a data storage and analysis system.

Example 5 General Protocol for Membrane Surface Protein Labeling Using Amine Modifying Reagents

1. Prepare suspension of 10⁶-10⁸ cells/ml in a PBS solution (10 mM sodium phosphate, 0.15 M NaCl, pH=7.4).
2. A water soluble amine modifying reagent (labeling agent) may be dissolved directly in an isotonic buffer which does not contain primary amines. Depending on solubility, the reagent may instead be dissolved in anhydrous N,N-dimethylformamide (DMF).
3. Attach amine modifying reagents to cell surface proteins, estimating a 10:1 ratio of labeling agent to membrane surface protein. This number may need to be optimized for different closed membrane surfaces. Published protocols are also available: [Prolinx: Protocol VER#5000-1; VER#5000-1; VER#1015]
   a. Incubate the cells in amine modifying reagent at 4 degrees for 1 hour.
   b. Modifier solution should be prepared fresh right before use.
   c. Reaction conditions can vary depending on the cells and therefore may be optimized.
4. Add glycine for the removal of excess unreacted tag.
5. Solubilize cell membrane using 1% Triton X-100 and remove nuclear fraction.
6. Add detergent (for example SDS to 1%) for full membrane solubilization. The detergent for membrane solubilization must be compatible with the tagging reagents.
7. Separate labeled proteins using agarose chromatography (published protocol: [Prolinx protocol #VER1020]) or SPM-HC separation beads (published protocol: [Prolinx protocol #Ver1026]).
8. Elute the labeled proteins.
9. The labeled proteins can be concentrated from the elution solution using TCA precipitation and re-solubilization in the following buffers according to the need:
   a. Laemmli buffer for SDS-PAGE
   b. Solubilization buffer for 2D gels, for example “Proteomem” from Sigma (St. Louis, Mo.)
   c. Separation in LC-MS

Example 6 General Protocol for Labeling Membrane Surface Glycoproteins Using a Carbohydrate Modifying Reagent

-   1. Prepare a suspension of 10⁶-10⁸ cells/ml in a PBS solution (10 mM sodium phosphate, 0.15 M NaCl, pH 7.4).
-   2. Oxidize the cell surface glycans. Oxidation conditions should be optimized based on the cells.
    -   Chemical oxidation using NaIO₄ [GlycoTrack™ Glycoprotein detection kit K-050], protocol 1a for labeling on membrane and protocol 2a for labeling in solution.
    -   Protocol for labeling cell surface glycoproteins and protocol for labeling of glycoproteins in solution: “Bioconjugate Techniques”, G. T. Hermanson, Academic Press, 314-315 (1996).
-   3. Attach the carbohydrate modifying reagent to cell surface glycoproteins using the following protocol: “Bioconjugate Techniques”, G. T. Hermanson, Academic Press, 314-315 (1996).
    -   a. Incubate the cells in the carbohydrate modifying reagent at 4° C. for 1 hour in the dark.
    -   b. The modifier solution should be prepared fresh immediately before use.
    -   c. Reaction conditions can vary depending on the cells and therefore must be optimized.
-   4. Add a small amount of glycerol to remove excess unreacted tag.
-   5. Solubilize the cell membrane using 1% Triton X-100 and remove the nuclear fraction.
-   6. Add detergent (for example, SDS to 1%) for full membrane solubilization. The detergent for membrane solubilization must be compatible with the tagging reagents.
-   7.
    -   Separate the labeled proteins using agarose chromatography (published protocol: [Prolinx protocol #VER1020]) or SPM-HC separation beads (published protocol: [Prolinx protocol #Ver1026]).
    -   Alternatively, reduce the disulfide bonds using DTT or another reducing agent, and then separate the proteins from the mixture using size exclusion chromatography.
-   8.
    -   Elute the labeled proteins.
    -   Alternatively, reduce the disulfide bonds using DTT or another reducing agent, and then separate the proteins from the mixture using size exclusion chromatography.
    -   Alternatively, de-glycosylate the labeled glycoprotein.
-   9. The proteins can be concentrated, if needed, from the elution solution using TCA precipitation and re-solubilized in a buffer suitable for the separation system:
    -   Laemmli buffer for SDS-PAGE
    -   Solubilization buffer for 2D gels, “Proteomem” from Sigma
    -   Separation in LC-MS

Information and published protocols for Examples 5 and 6 may be found at the following websites, and such information available as of the application filing date is herein incorporated by reference:

-   http://www.prolinx.com/
-   http://www.prozyme.com/glycopro/
-   http://www.hamptonresearch.com/catalog.html
-   http://www.europa-bioproducts.com/catagory.asp?MyVar=Kits
-   http://informagen.com/Resource_Informagen/Deprecated/868.html
-   http://www.htscreening.net/suppliers/readdet.html
-   http://scooter.cyto.purdue.edu/pucl_cd/flow/vol4/7_spon/biorad/tree.htm

Example 7 Exemplary Scheme for Synthesis of a Labeling Agent Comprising a BPA Group and a Disulfide Bond

Example 8 Exemplary Preparation of NHS-Sulfonate Esters from Carboxylic Acids

NHS and NHS-Sulfonate form active esters, which are used for coupling an amine to a carboxyl group.

The attachment of NHS-S to the carboxylic acid can be done either during the coupling of the amine to the carboxyl or before the coupling.

Sulfo-NHS is reactive toward amines in the same way as the NHS ester; however, its resistance to hydrolysis in water is substantially better. Since most reactions for coupling biomolecules are performed in an aqueous environment, the advantage of using Sulfo-NHS is clear. Its stability and water solubility allow this material to be dissolved directly in a buffer, without first dissolving it in a dry organic solvent such as DMF (dimethylformamide).

Preparation of NHS and Sulfo-NHS Active Esters for Amine Coupling:

Incorporation by Reference

All publications and patents mentioned herein, including those items listed below, are hereby incorporated by reference in their entirety as if each individual publication or patent was specifically and individually indicated to be incorporated by reference. In case of conflict, the present application, including any definitions herein, will control.

Equivalents

While specific embodiments of the subject invention have been discussed, the above specification is illustrative and not restrictive. Many variations of the invention will become apparent to those skilled in the art upon review of this specification. The appended claims are not intended to claim all such embodiments and variations, and the full scope of the invention should be determined by reference to the claims, along with their full scope of equivalents, and the specification, along with such variations.

1. A method of selectively labeling membrane surface proteins comprising: (a) contacting closed membrane structures with a first labeling agent, thereby generating a plurality of primary labeled membrane surface proteins, wherein said first labeling agent comprises a disulfide bond; (b) reducing said disulfide bond to produce primary labeled membrane surface proteins having free thiols; (c) contacting said primary labeled membrane surface proteins with a second labeling agent, thereby generating a plurality of secondary labeled membrane surface proteins, wherein said second labeling agent comprises a thiol-reactive protein binding moiety; (d) separating said plurality of secondary labeled membrane surface proteins from proteins not having a secondary label to obtain selectively labeled membrane surface proteins.
 2. A method for generating a cell surface protein profile, comprising: (a) contacting cells with a labeling agent, thereby generating a plurality of labeled cell surface proteins; (b) separating said plurality of labeled cell surface proteins from unlabeled proteins; and (c) identifying said labeled cell surface proteins separated in step (b), wherein the cell surface protein profile comprises the identity of the labeled cell surface proteins identified in step (c) and wherein further said labeling agent is selected from the group consisting of:

wherein R is present 1 to 4 times and is selected from the group consisting of —B(OH)₂,

D is selected from the group consisting of O, S, and NH; Q is selected from the group consisting of OR₂, NHR₂, NHOR₂, and CH₂-EWG, wherein EWG is an electron withdrawing group, such as CN, COOH, etc.; W is selected from the group consisting of N(R₂)CO, CON(R₂), N(R₂)COC(R₂)₂, CON(R₂)C(R₂)₂, O, OC(R₂)₂, S, and S(R₂)₂; Z is selected from the group consisting of a saturated or unsaturated chain up to about 6 carbon equivalents in length, unbranched saturated or unsaturated chain of from about 6 to 18 carbon equivalents in length with at least one intermediate amide or disulfide moiety, and a polyethylene glycol chain of from about 3 to 12 carbon equivalents in length; R₁ is a reactive electrophilic or nucleophilic moiety; R₂ is H, alkyl, or aryl; and R₃ is present 1 or 2 times and is OH.
 3. A method for identifying cell surface proteins, comprising: (a) contacting cells with a labeling agent, thereby generating a plurality of labeled cell surface proteins; (b) separating said plurality of labeled cell surface proteins from unlabeled proteins; and (c) identifying separated labeled cell surface proteins; wherein further said labeling agent is selected from the group consisting of:

wherein R is present 1 to 4 times and is selected from the group consisting of —B(OH)₂,

D is selected from the group consisting of O, S, and NH; Q is selected from the group consisting of OR₂, NHR₂, NHOR₂, and CH₂-EWG, wherein EWG is an electron withdrawing group, such as CN, COOH, etc.; W is selected from the group consisting of N(R₂)CO, CON(R₂), N(R₂)COC(R₂)₂, CON(R₂)C(R₂)₂, O, OC(R₂)₂, S, and S(R₂)₂; Z is selected from the group consisting of a saturated or unsaturated chain up to about 6 carbon equivalents in length, unbranched saturated or unsaturated chain of from about 6 to 18 carbon equivalents in length with at least one intermediate amide or disulfide moiety, and a polyethylene glycol chain of from about 3 to 12 carbon equivalents in length; R₁ is a reactive electrophilic or nucleophilic moiety; R₂ is H, alkyl, or aryl; and R₃ is present 1 or 2 times and is OH.
 4. The method of claim 1, wherein said labeling agent comprises a marking moiety and a protein binding moiety.
 5. The method of claim 1, wherein said first labeling agent is selected from the group consisting of:

wherein R is present 1 to 4 times and is selected from the group consisting of —B(OH)₂,

D is selected from the group consisting of O, S, and NH; Q is selected from the group consisting of OR₂, NHR₂, NHOR₂, and CH₂-EWG, wherein EWG is an electron withdrawing group, such as CN, COOH, etc.; W is selected from the group consisting of N(R₂)CO, CON(R₂), N(R₂)COC(R₂)₂, CON(R₂)C(R₂)₂, O, OC(R₂)₂, S, and S(R₂)₂; Z is an unbranched saturated or unsaturated chain of from about 6 to 18 carbon equivalents in length with at least one disulfide moiety; R₁ is a reactive electrophilic or nucleophilic moiety; R₂ is H, alkyl, or aryl; and R₃ is present 1 or 2 times and is OH.
 6. The method of claim 1, wherein said second labeling agent is fluorescent.
 7. The method of claim 1, wherein said second labeling agent is radioactive.
 8. The method of any one of claims 1, 2, or 3, wherein said cells are eukaryotic cells.
 9. The method of claim 8, further comprising washing said eukaryotic cells with a divalent ion chelator to remove extracellular matrix.
 10. The method of claim 9, wherein said divalent ion chelator is EDTA.
 11. The method of any one of claims 1, 2, or 3, wherein said plurality of labeled cell surface proteins are separated by one-dimensional SDS polyacrylamide gel electrophoresis.
 12. The method of any one of claims 1, 2, or 3, wherein said plurality of labeled cell surface proteins are separated by two-dimensional electrophoresis.
 13. The method of any one of claims 1, 2, or 3, wherein said labeled cell surface proteins are identified by mass spectrometry.
 14. The method of any one of claims 1, 2, or 3, wherein at least five proteins are identified.
 15. A method of classifying a disease state of a test cell sample comprising: (a) contacting cells obtained from said test cell sample with a labeling agent, thereby generating a plurality of labeled cell surface proteins; (b) separating said plurality of labeled cell surface proteins from unlabeled proteins; (c) identifying said labeled cell surface proteins separated in step (b); (d) preparing a test cell surface protein profile, said profile comprising the identity of the labeled membrane surface proteins identified in step (c); and (e) comparing said test sample cell surface protein profile to a plurality of reference cell surface protein profiles obtained from reference cell samples, wherein said disease state of the test cell sample is classified based on similarities and differences of the test cell surface protein profile with the reference cell surface protein profiles.
 16. A method of claim 15, wherein said test cell sample is suspected of having cancerous cells, and wherein at least one of said reference cell surface protein profiles is obtained from a reference cell sample having cancerous cells.
 17. A method of claim 15, wherein said test cell sample is suspected of having cells infected with a virus, and wherein at least one of said reference cell surface protein profiles is obtained from a reference cell sample having cells infected with a virus.
 18. A method of generating a disease-specific cell surface protein profile comprising, (a) contacting cells obtained from a diseased cell sample with a labeling agent, thereby generating a plurality of labeled cell surface proteins; (b) separating said plurality of labeled cell surface proteins from unlabeled proteins; (c) identifying said labeled cell surface proteins separated in step (b); (d) preparing a diseased cell surface protein profile, said profile comprising the identity of the labeled cell surface proteins identified in step (c); and (e) comparing said diseased cell surface protein profile to a control cell surface protein profile obtained from a control cell sample, wherein the disease-specific cell surface protein profile comprises the identity of at least one protein that differs significantly in abundance or post-translational modification in the diseased cell sample as compared to the control cell sample.
 19. A method of identifying a disorder-specific cell surface marker protein comprising, (a) contacting cells obtained from a disordered cell sample with a labeling agent, thereby generating a plurality of labeled cell surface proteins; (b) separating said plurality of labeled cell surface proteins from unlabeled proteins; (c) identifying separated labeled cell surface proteins; (d) preparing a diseased cell surface protein profile, said profile comprising the identity of said labeled cell surface proteins identified in step (c); and (e) comparing said diseased cell surface protein profile to at least one control cell surface protein profile obtained from a control cell sample, wherein any protein that differs significantly in abundance or post-translational modification in the diseased cell sample as compared to the control cell sample is a disease-specific cell surface marker.
 20. The method of any one of claims 15, 18, or 19, wherein said labeling agent is selected from the group consisting of:

wherein R is present 1 to 4 times and is selected from the group consisting of —B(OH)₂,

D is selected from the group consisting of O, S, and NH; Q is selected from the group consisting of OR₂, NHR₂, NHOR₂, and CH₂-EWG, wherein EWG is an electron withdrawing group, such as CN, COOH, etc.; W is selected from the group consisting of N(R₂)CO, CON(R₂), N(R₂)COC(R₂)₂, CON(R₂)C(R₂)₂, O, OC(R₂)₂, S, and S(R₂)₂; Z is selected from the group consisting of a saturated or unsaturated chain up to about 6 carbon equivalents in length, unbranched saturated or unsaturated chain of from about 6 to 18 carbon equivalents in length with at least one intermediate amide or disulfide moiety, and a polyethylene glycol chain of from about 3 to 12 carbon equivalents in length; R₁ is a reactive electrophilic or nucleophilic moiety; R₂ is H, alkyl, or aryl; and R₃ is present 1 or 2 times and is OH.
 21. The method of any one of claims 15, 18, or 19, wherein said labeling agent is lectin.
 22. A method of claim 15, 18 or 19, wherein said closed membrane structure is an organelle, a membrane vesicle or a cell.
 23. A labeling agent represented by structure 1:

wherein: R is present 1 to 4 times; R is selected from the group consisting of —B(OH)₂,

W is a linker selected from the group consisting of N(R₂)CO, CON(R₂), N(R₂)COC(R₂)₂, CON(R₂)C(R₂)₂, O, OC(R₂)₂, S, and S(R₂)₂; Z is a spacer selected from the group consisting of an unbranched saturated or unsaturated chain of from about 6 to 18 carbon equivalents in length with at least one intermediate amide or disulfide moiety and a polyethylene glycol chain of from about 3 to 12 carbon equivalents in length; R₁ is a reactive electrophilic or nucleophilic moiety suitable for reaction of the PDAB (phenyldiboronic acid) with a protein; and R₂ is H, alkyl, or aryl.
 24. The labeling agent of claim 23, wherein Z contains a disulfide moiety.
 25. The labeling agent of claim 23, wherein R is —B(OH)₂, W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A:


26. The labeling agent of claim 23, wherein R is

W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.
 27. The labeling agent of claim 23, wherein R is

W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.
 28. The labeling agent of claim 23, wherein R is —B(OH)₂, W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.
 29. The labeling agent of claim 23, wherein R is

W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.
 30. The labeling agent of claim 23, wherein R is

W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.
 31. The labeling agent of claim 23, wherein R is —B(OH)₂, W is CH₂NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.
 32. The labeling agent of claim 23, wherein R is

W is CH₂NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.
 33. The labeling agent of claim 23, wherein R is

W is CH₂NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydrazide of structure A.
 34. The labeling agent of claim 23, wherein R is —B(OH)₂, W is CH₂NHCO, Z is (CH₂)ₙC(O)NH(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydroxysulfo-succinimidyl ester of structure B:


35. The labeling agent of claim 23, wherein R is —B(OH)₂, W is CH₂NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 36. The labeling agent of claim 23, wherein R is

W is CH₂NHCO, Z is (CH₂)ₙC(O)NH(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 37. The labeling agent of claim 23, wherein R is

W is CH₂NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 38. The labeling agent of claim 23, wherein R is

W is CONH, Z is (CH₂)₅, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 39. The labeling agent of claim 23, wherein R is —B(OH)₂, W is CONH, Z is (CH₂)₅, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 40. The labeling agent of claim 23, wherein R is

W is NHCO, Z is (CH₂)₂C(O)NH(CH₂)₅, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 41. The labeling agent of claim 23, wherein R is

W is NHCO, Z is (CH₂)₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 42. A labeling agent represented by structure 2:

wherein: R₃ is present 1 or 2 times and is OH; D is selected from the group consisting of O, S, and NH; Q is selected from the group consisting of OR₂, NHR₂, NHOR₂, and CH₂-EWG, wherein EWG is an electron withdrawing group, such as CN, COOH, etc.; W is a linker selected from the group consisting of N(R₂)CO, CON(R₂), N(R₂)COC(R₂)₂, CON(R₂)C(R₂)₂, O, OC(R₂)₂, S, and S(R₂)₂; Z is a spacer selected from the group consisting of unbranched saturated or unsaturated chain of from about 6 to 18 carbon equivalents in length with at least one intermediate amide or disulfide moiety and a polyethylene glycol chain of from about 3 to 12 carbon equivalents in length; R₁ is a reactive electrophilic or nucleophilic moiety; and R₂ is H, alkyl or aryl.
 43. The labeling agent of claim 42, wherein Z contains a disulfide moiety.
 44. The labeling agent of claim 42, wherein R is present one time, W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A:


45. The labeling agent of claim 42, wherein R is present one time, W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.
 46. The labeling agent of claim 42, wherein R is present two times, W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.
 47. The labeling agent of claim 42, wherein R is present two times, W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.
 48. The labeling agent of claim 42, wherein R is present one time, W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.
 49. The labeling agent of claim 42, wherein R is present one time, W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.
 50. The labeling agent of claim 42, wherein R is present two times, W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure A.
 51. The labeling agent of claim 42, wherein R is present two times, W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydrazide of structure A.
 52. The labeling agent of claim 42, wherein R is present one time, W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydrazide of structure B:


53. The labeling agent of claim 42, wherein R is present one time, W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 54. The labeling agent of claim 42, wherein R is present two times, W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 55. The labeling agent of claim 42, wherein R is present two times, W is NHCO, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 56. The labeling agent of claim 42, wherein R is present one time, W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 57. The labeling agent of claim 42, wherein R is present one time, W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 58. The labeling agent of claim 42, wherein R is present two times, W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is OR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 59. The labeling agent of claim 42, wherein R is present two times, W is CONH, Z is (CH₂)ₙ—S—S—(CH₂)ₙ wherein n is an integer from 1 to 6 inclusively, Q is NHOR₂, and R₁ is a hydroxysulfo-succinimidyl ester of structure B.
 60. The method of claim 2, further comprising: affixing to a solid substrate an agent that binds to the marking moiety of the labeling reagent to generate an affinity-prepared substrate; and contacting the affinity-prepared substrate with the labeled membrane surface proteins, thereby generating an array of membrane surface proteins affixed to a solid substrate.
 61. The method of claim 60, further comprising: performing a mass spectrometry analysis of a plurality of the membrane surface proteins affixed to the solid surface.
 62. A linking agent represented by the structure:


63. A linking agent represented by the structure:


64. A linking agent represented by the structure:


65. A linking agent represented by the structure:


66. A linking agent represented by the structure:


67. A linking agent represented by the structure:


68. A linking agent represented by the structure: